[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-08-18 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1404:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Patch checked in.  Thanks Romain for all your work on this.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404-4-doc.patch, 
 PIG-1404-4.patch, PIG-1404-5.patch, PIG-1404.patch


 The goal is to provide a simple xUnit framework that enables our Pig scripts 
 to be easily:
   - unit tested
   - regression tested
   - quickly prototyped
 No cluster set up is required.
 For example:
 TestCase
 {code}
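   @Test
   public void testTop3Queries() throws Exception {
     // Script parameters; n=3 is substituted for $n in top_queries.pig.
     String[] args = {
       "n=3",
     };
     PigTest test = new PigTest("top_queries.pig", args);

     // Tab-separated rows fed to the alias "data".
     String[] input = {
       "yahoo\t10",
       "twitter\t7",
       "facebook\t10",
       "yahoo\t15",
       "facebook\t5",
     };

     // Expected tuples of the alias "queries_limit".
     String[] output = {
       "(yahoo,25L)",
       "(facebook,15L)",
       "(twitter,7L)",
     };

     test.assertOutput("data", input, "queries_limit", output);
   }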
 {code}
 top_queries.pig
 {code}
 data =
     LOAD '$input'
     AS (query:CHARARRAY, count:INT);

 ...

 queries_sum =
     FOREACH queries_group
     GENERATE
         group AS query,
         SUM(queries.count) AS count;

 ...

 queries_limit = LIMIT queries_ordered $n;
 STORE queries_limit INTO '$output';
 {code}
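 As a cross-check of the expected tuples in the test above, the arithmetic can be sketched in plain Java. This is not PigUnit or Pig code: the class and method names are made up for illustration, and the sketch assumes that the elided parts of the script group the rows by query and order them by the summed count, descending.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Pig or PigUnit.
final class TopQueriesSketch {

    // Sums the counts per query, orders by total (descending), keeps the first n rows.
    // ASSUMPTION: this mirrors the elided group/order steps of top_queries.pig.
    static List<String> topQueries(String[] input, int n) {
        Map<String, Long> sums = new LinkedHashMap<>();
        for (String line : input) {
            String[] fields = line.split("\t"); // query <TAB> count
            sums.merge(fields[0], Long.parseLong(fields[1]), Long::sum);
        }

        List<Map.Entry<String, Long>> rows = new ArrayList<>(sums.entrySet());
        rows.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));

        // Format each row like the expected strings in the test, e.g. (yahoo,25L).
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> row : rows.subList(0, Math.min(n, rows.size()))) {
            out.add("(" + row.getKey() + "," + row.getValue() + "L)");
        }
        return out;
    }

    public static void main(String[] args) {
        String[] input = {
            "yahoo\t10", "twitter\t7", "facebook\t10", "yahoo\t15", "facebook\t5",
        };
        System.out.println(topQueries(input, 3));
    }
}
```

 With the five input rows of the test, yahoo sums to 10 + 15 = 25 and facebook to 10 + 5 = 15, so main prints [(yahoo,25L), (facebook,15L), (twitter,7L)].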
 There are 3 modes:
 * LOCAL (if the pigunit.exectype.local property is present)
 * MAPREDUCE (uses the cluster specified in the classpath, same as 
 HADOOP_CONF_DIR)
 ** automatic mini cluster (the default; the HADOOP_CONF_DIR to put on 
 the classpath is ~/pigtest/conf)
 ** an existing cluster (if the pigunit.exectype.cluster property 
 is present)
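 The selection rule in the list above could look like the following sketch. ExecModeSketch and its Mode enum are hypothetical names, not part of PigUnit; only the two property names come from the description. Checking the local property first when both are set is an assumption. The sketch follows the description as written; the comment posted with PIG-1404-4.patch in this thread says that local mode is the default in that patch.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not the PigUnit implementation.
final class ExecModeSketch {

    enum Mode { LOCAL, MINI_CLUSTER, EXISTING_CLUSTER }

    // Picks a mode from the presence of the properties named in the description.
    // ASSUMPTION: pigunit.exectype.local wins if both properties are set.
    static Mode resolve(Properties props) {
        if (props.containsKey("pigunit.exectype.local")) {
            return Mode.LOCAL;
        }
        if (props.containsKey("pigunit.exectype.cluster")) {
            return Mode.EXISTING_CLUSTER;
        }
        return Mode.MINI_CLUSTER; // the default, per the description
    }

    public static void main(String[] args) {
        // Try: java -Dpigunit.exectype.local=true ExecModeSketch
        System.out.println(resolve(System.getProperties()));
    }
}
```

 Only the presence of a property is checked, not its value, which matches the "if ... is present" wording of the list.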
 For now, it would be nice to see how this idea could be integrated in 
 Piggybank and if PigParser/PigServer could improve their interfaces in order 
 to make PigUnit simple.
 Other components based on PigUnit could be built later:
   - standalone MiniCluster
   - notion of workspaces for each test
   - standalone utility that reads test configuration and generates a test 
 report...
 This is a first prototype; it is open to suggestions and would definitely 
 benefit from feedback.
 How to test, in pig_trunk:
 {code}
 Apply patch
 $pig_trunk ant compile-test
 $pig_trunk ant
 $pig_trunk/contrib/piggybank/java ant test -Dtest.timeout=99
 {code}
 (This takes 15 min with the MAPREDUCE mini cluster; the tests will need to be 
 split into 'unit' and 'integration' in the future.)
 Many examples are in:
 {code}
 contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/pigunit/TestPigTest.java
 {code}
 When used standalone, remember to put commons-lang-2.4.jar and the 
 HADOOP_CONF_DIR of your cluster on your CLASSPATH.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-08-17 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1404:


Attachment: PIG-1404-5.patch

A rework of the patch that moves the code from pigunit to test.  

To answer Romain's question, I have put the code under 
test/org/apache/pig/pigunit and tests for PigUnit under 
test/org/apache/pig/test/pigunit.

I have added pigunit-jar and test-pigunit targets to the top-level build.xml to 
build pigunit.jar and run the PigUnit tests.

I have also updated the documentation with the new source locations and 
included that documentation in this patch.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404-4-doc.patch, 
 PIG-1404-4.patch, PIG-1404-5.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-08-17 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1404:


Attachment: (was: PIG-1404-5.patch)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404-4-doc.patch, 
 PIG-1404-4.patch, PIG-1404-5.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-08-17 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1404:


Status: Patch Available  (was: Open)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404-4-doc.patch, 
 PIG-1404-4.patch, PIG-1404-5.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-08-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Status: Open  (was: Patch Available)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-08-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: PIG-1404-4.patch
PIG-1404-4-doc.patch

Hi everyone,

Here is a slightly updated version where the default execution is Pig local 
mode (so no need to add any HADOOP_CONF_DIR in the classpath by default).

Then regarding PigUnit, do you see it:
  - as a version 1 that can be committed soon (and could be found in either 
pig.jar, pigunit.jar or piggybank.jar...)?
  - a proof of concept for a more sophisticated Pig testing framework?
 
Feel free to give feedback. I will also have time for it in August.

Thanks,

Romain

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404-4-doc.patch, 
 PIG-1404-4.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-07-02 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Status: Open  (was: Patch Available)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-07-02 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Status: Patch Available  (was: Open)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-06-30 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: PIG-1404-3.patch
PIG-1404-3-doc.patch

Sorry it took some time, but here is an updated patch, with some improvements, 
that simplifies PigUnit a little more.

I also added some documentation, but I guess it should be reviewed in another 
patch once this one has been committed.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, 
 PIG-1404-3-doc.patch, PIG-1404-3.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-18 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1404:


Attachment: PIG-1404-2.patch

A rework of the patch to move it from piggybank to a separate pigunit directory.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-18 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1404:


Status: Patch Available  (was: Open)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
Assignee: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404-2.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-11 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: PIG-1404.patch

Some cleaning and commenting.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404.patch, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-11 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: (was: PIG-1404.patch)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: PIG-1404.patch

First prototype.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: PIG-1404.patch


 The goal is to provide a simple xUnit framework that enables our Pig scripts 
 to be easily:
   - unit tested
   - regression tested
   - quickly prototyped
 For example:
 TestCase
 {code}
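   @Test
   public void testTop3Queries() {
     // Pig parameters passed to the script ($n)
     String[] args = {
       "n=3",
     };
     test = new PigTest("top_queries.pig", args);
     // Rows fed to the 'data' alias (tab-separated query and count)
     String[] input = {
       "yahoo\t10",
       "twitter\t7",
       "facebook\t10",
       "yahoo\t15",
       "facebook\t5",
     };
     // Expected tuples for the 'queries_limit' alias
     String[] output = {
       "(yahoo,25L)",
       "(facebook,15L)",
       "(twitter,7L)",
     };
     test.assertOutput("data", input, "queries_limit", output);
   }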
 {code}
 top_queries.pig
 {code}
 data =
 LOAD '$input'
 AS (query:CHARARRAY, count:INT);
  
 ... 
 
 queries_sum = 
 FOREACH queries_group 
 GENERATE 
 group AS query, 
 SUM(queries.count) AS count;
 
 ...
 
 queries_limit = LIMIT queries_ordered $n;
 STORE queries_limit INTO '$output';
 {code}
 There are 3 modes:
   - LOCAL (if the pigunit.exectype.local property is present)
   - MAPREDUCE (uses the cluster specified in the classpath, same as 
 HADOOP_CONF_DIR)
     - automatic mini cluster (default)
     - pointing to an existing cluster (if the pigunit.exectype.cluster 
 property is present)
 For now, it would be nice to see how this idea could be integrated in 
 Piggybank and if PigParser/PigServer could improve their interfaces in order 
 to make PigUnit simple.
 Other components based on PigUnit could be built later:
   - standalone MiniCluster
   - notion of workspaces for each test
   - standalone utility that reads test configuration and generates a test 
 report...
 This is a first prototype; it is open to suggestions and can definitely 
 benefit from feedback.
 How to test, in pig_trunk
 Apply patch
 $pig_trunk ant
 $pig_trunk/contrib/piggybank/java ant test -Dtest.timeout=99
 (it takes 15 min in MAPREDUCE minicluster, tests will need to be split in the 
 future between 'unit' and 'integration')
 Many examples are in:
 contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/pigunit/TestPigTest.java




[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: (was: PIG-1404.patch)

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: PIG-1404.patch

First prototype.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: PIG-1404.patch





[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Description: 
The goal is to provide a simple xUnit framework that enables our Pig scripts to 
be easily:
  - unit tested
  - regression tested
  - quickly prototyped

For example:

TestCase
{code}
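  @Test
  public void testTop3Queries() {
    // Pig parameters passed to the script ($n)
    String[] args = {
      "n=3",
    };
    test = new PigTest("top_queries.pig", args);

    // Rows fed to the 'data' alias (tab-separated query and count)
    String[] input = {
      "yahoo\t10",
      "twitter\t7",
      "facebook\t10",
      "yahoo\t15",
      "facebook\t5",
    };

    // Expected tuples for the 'queries_limit' alias
    String[] output = {
      "(yahoo,25L)",
      "(facebook,15L)",
      "(twitter,7L)",
    };

    test.assertOutput("data", input, "queries_limit", output);
  }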
{code}

top_queries.pig
{code}
data =
LOAD '$input'
AS (query:CHARARRAY, count:INT);
 
... 

queries_sum = 
FOREACH queries_group 
GENERATE 
group AS query, 
SUM(queries.count) AS count;

...

queries_limit = LIMIT queries_ordered $n;

STORE queries_limit INTO '$output';
{code}

There are 3 modes:
  - LOCAL (if the pigunit.exectype.local property is present)
  - MAPREDUCE (uses the cluster specified in the classpath, same as 
HADOOP_CONF_DIR)
    - automatic mini cluster (default)
    - pointing to an existing cluster (if the pigunit.exectype.cluster 
property is present)

For now, it would be nice to see how this idea could be integrated in Piggybank 
and if PigParser/PigServer could improve their interfaces in order to make 
PigUnit simple.

Other components based on PigUnit could be built later:
  - standalone MiniCluster
  - notion of workspaces for each test
  - standalone utility that reads test configuration and generates a test 
report...

This is a first prototype; it is open to suggestions and can definitely benefit 
from feedback.

How to test, in pig_trunk
Apply patch
$pig_trunk ant
$pig_trunk/contrib/piggybank/java ant test -Dtest.timeout=99

(it takes 15 min in MAPREDUCE minicluster, tests will need to be split in the 
future between 'unit' and 'integration')

Many examples are in:
contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/pigunit/TestPigTest.java




 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: PIG-1404.patch



[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Description: 
The goal is to provide a simple xUnit framework that enables our Pig scripts to 
be easily:
  - unit tested
  - regression tested
  - quickly prototyped

For example:

TestCase
{code}
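  @Test
  public void testTop3Queries() {
    // Pig parameters passed to the script ($n)
    String[] args = {
      "n=3",
    };
    test = new PigTest("top_queries.pig", args);

    // Rows fed to the 'data' alias (tab-separated query and count)
    String[] input = {
      "yahoo\t10",
      "twitter\t7",
      "facebook\t10",
      "yahoo\t15",
      "facebook\t5",
    };

    // Expected tuples for the 'queries_limit' alias
    String[] output = {
      "(yahoo,25L)",
      "(facebook,15L)",
      "(twitter,7L)",
    };

    test.assertOutput("data", input, "queries_limit", output);
  }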
{code}

top_queries.pig
{code}
data =
LOAD '$input'
AS (query:CHARARRAY, count:INT);
 
... 

queries_sum = 
FOREACH queries_group 
GENERATE 
group AS query, 
SUM(queries.count) AS count;

...

queries_limit = LIMIT queries_ordered $n;

STORE queries_limit INTO '$output';
{code}

There are 3 modes:
* LOCAL (if the pigunit.exectype.local property is present)
* MAPREDUCE (uses the cluster specified in the classpath, same as 
HADOOP_CONF_DIR)
** automatic mini cluster (default)
** pointing to an existing cluster (if the pigunit.exectype.cluster property is 
present)

For now, it would be nice to see how this idea could be integrated in Piggybank 
and if PigParser/PigServer could improve their interfaces in order to make 
PigUnit simple.

Other components based on PigUnit could be built later:
  - standalone MiniCluster
  - notion of workspaces for each test
  - standalone utility that reads test configuration and generates a test 
report...

This is a first prototype; it is open to suggestions and can definitely benefit 
from feedback.

How to test, in pig_trunk
Apply patch
$pig_trunk ant
$pig_trunk/contrib/piggybank/java ant test -Dtest.timeout=99

(it takes 15 min in MAPREDUCE minicluster, tests will need to be split in the 
future between 'unit' and 'integration')

Many examples are in:
contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/pigunit/TestPigTest.java




 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: PIG-1404.patch



[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Description: 
The goal is to provide a simple xUnit framework that enables our Pig scripts to 
be easily:
  - unit tested
  - regression tested
  - quickly prototyped

For example:

TestCase
{code}
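  @Test
  public void testTop3Queries() {
    // Pig parameters passed to the script ($n)
    String[] args = {
      "n=3",
    };
    test = new PigTest("top_queries.pig", args);

    // Rows fed to the 'data' alias (tab-separated query and count)
    String[] input = {
      "yahoo\t10",
      "twitter\t7",
      "facebook\t10",
      "yahoo\t15",
      "facebook\t5",
    };

    // Expected tuples for the 'queries_limit' alias
    String[] output = {
      "(yahoo,25L)",
      "(facebook,15L)",
      "(twitter,7L)",
    };

    test.assertOutput("data", input, "queries_limit", output);
  }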
{code}

top_queries.pig
{code}
data =
LOAD '$input'
AS (query:CHARARRAY, count:INT);
 
... 

queries_sum = 
FOREACH queries_group 
GENERATE 
group AS query, 
SUM(queries.count) AS count;

...

queries_limit = LIMIT queries_ordered $n;

STORE queries_limit INTO '$output';
{code}

There are 3 modes:
* LOCAL (if the pigunit.exectype.local property is present)
* MAPREDUCE (uses the cluster specified in the classpath, same as 
HADOOP_CONF_DIR)
** automatic mini cluster (default)
** pointing to an existing cluster (if the pigunit.exectype.cluster property is 
present)

For now, it would be nice to see how this idea could be integrated in Piggybank 
and if PigParser/PigServer could improve their interfaces in order to make 
PigUnit simple.

Other components based on PigUnit could be built later:
  - standalone MiniCluster
  - notion of workspaces for each test
  - standalone utility that reads test configuration and generates a test 
report...

This is a first prototype; it is open to suggestions and can definitely benefit 
from feedback.

How to test, in pig_trunk:
{code}
Apply patch
$pig_trunk ant
$pig_trunk/contrib/piggybank/java ant test -Dtest.timeout=99
{code}

(it takes 15 min in MAPREDUCE minicluster, tests will need to be split in the 
future between 'unit' and 'integration')

Many examples are in:
{code}
contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/pigunit/TestPigTest.java
{code}



 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: PIG-1404.patch



[jira] Updated: (PIG-1404) PigUnit - Pig script testing simplified.

2010-05-03 Thread Romain Rigaux (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Rigaux updated PIG-1404:
---

Attachment: commons-lang-2.4.jar

piggybank.jar needs org.apache.commons.lang.StringUtils for now.

 PigUnit - Pig script testing simplified. 
 -

 Key: PIG-1404
 URL: https://issues.apache.org/jira/browse/PIG-1404
 Project: Pig
  Issue Type: New Feature
Reporter: Romain Rigaux
 Fix For: 0.8.0

 Attachments: commons-lang-2.4.jar, PIG-1404.patch


 The goal is to provide a simple xUnit framework that enables our Pig scripts 
 to be easily:
   - unit tested
   - regression tested
   - quickly prototyped
 For example:
 TestCase
 {code}
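   @Test
   public void testTop3Queries() {
     // Pig parameters passed to the script ($n)
     String[] args = {
       "n=3",
     };
     test = new PigTest("top_queries.pig", args);
     // Rows fed to the 'data' alias (tab-separated query and count)
     String[] input = {
       "yahoo\t10",
       "twitter\t7",
       "facebook\t10",
       "yahoo\t15",
       "facebook\t5",
     };
     // Expected tuples for the 'queries_limit' alias
     String[] output = {
       "(yahoo,25L)",
       "(facebook,15L)",
       "(twitter,7L)",
     };
     test.assertOutput("data", input, "queries_limit", output);
   }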
 {code}
 top_queries.pig
 {code}
 data =
 LOAD '$input'
 AS (query:CHARARRAY, count:INT);
  
 ... 
 
 queries_sum = 
 FOREACH queries_group 
 GENERATE 
 group AS query, 
 SUM(queries.count) AS count;
 
 ...
 
 queries_limit = LIMIT queries_ordered $n;
 STORE queries_limit INTO '$output';
 {code}
 There are 3 modes:
 * LOCAL (if the pigunit.exectype.local property is present)
 * MAPREDUCE (uses the cluster specified in the classpath, same as 
 HADOOP_CONF_DIR)
 ** automatic mini cluster (default)
 ** pointing to an existing cluster (if the pigunit.exectype.cluster property 
 is present)
 For now, it would be nice to see how this idea could be integrated in 
 Piggybank and if PigParser/PigServer could improve their interfaces in order 
 to make PigUnit simple.
 Other components based on PigUnit could be built later:
   - standalone MiniCluster
   - notion of workspaces for each test
   - standalone utility that reads test configuration and generates a test 
 report...
 This is a first prototype; it is open to suggestions and can definitely 
 benefit from feedback.
 How to test, in pig_trunk:
 {code}
 Apply patch
 $pig_trunk ant compile-test
 $pig_trunk ant
 If you use the MiniCluster, the HADOOP_CONF_DIR to have in the class path 
 will be: ~/pigtest/conf.
 $pig_trunk/contrib/piggybank/java ant test -Dtest.timeout=99
 {code}
 (it takes 15 min in MAPREDUCE minicluster, tests will need to be split in the 
 future between 'unit' and 'integration')
 Many examples are in:
 {code}
 contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/pigunit/TestPigTest.java
 {code}
