[jira] Commented: (PIG-1441) New test targets: unit and smoke

2010-06-08 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876913#action_12876913
 ] 

Daniel Dai commented on PIG-1441:
-

+1

> New test targets: unit and smoke
> 
>
> Key: PIG-1441
> URL: https://issues.apache.org/jira/browse/PIG-1441
> Project: Pig
>  Issue Type: Improvement
>Reporter: Olga Natkovich
>Assignee: Olga Natkovich
> Fix For: 0.8.0
>
> Attachments: PIG-1441.patch, PIG-1441_2.patch
>
>
> As we get more and more tests, adding more structure would help us 
> minimize the time spent on testing. Here are 2 new targets I propose we add 
> (Hadoop has the same targets for the same purposes).
> unit - to run all true unit tests (those that truly test APIs and 
> internal functionality, rather than running e2e tests through JUnit). This 
> target should run relatively quickly (10-15 minutes) and, if we are good at 
> adding unit tests, will give good coverage.
> smoke - this would be a set of a few e2e tests that provide good overall 
> coverage within about 30 minutes.
> I would say that for simple patches we would still require only commit tests, 
> while for more involved patches the developers should run both unit and 
> smoke before submitting the patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs

2010-06-08 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1438:
--

Attachment: PIG-1438_1.patch

> [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
> -
>
> Key: PIG-1438
> URL: https://issues.apache.org/jira/browse/PIG-1438
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1438.patch, PIG-1438_1.patch
>
>
> Current implementation doesn't merge jobs derived from DISTINCT statements. 
> The reason is that DISTINCT jobs are implemented using a special combiner 
> (DistinctCombiner). But we should be able to merge jobs that have the same 
> type of combiner (e.g. merge multiple DISTINCT jobs into one).
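
The merge condition described above (jobs are merge candidates when they share the same combiner type) can be illustrated with a minimal sketch. This is not Pig's MultiQueryOptimizer code: the `Job` class, the `group_mergeable` helper, and the "AlgebraicCombiner" label are hypothetical names used only for illustration; only "DistinctCombiner" comes from the issue text.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for a map-reduce job in the plan."""
    name: str
    combiner: Optional[str]  # combiner type label, or None if the job has no combiner


def group_mergeable(jobs: List[Job]) -> Dict[Optional[str], List[str]]:
    """Group job names by combiner type; each group is a candidate for merging."""
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for job in jobs:
        groups[job.combiner].append(job.name)
    return dict(groups)


jobs = [
    Job("distinct_a", "DistinctCombiner"),
    Job("group_b", "AlgebraicCombiner"),  # hypothetical label for a non-DISTINCT job
    Job("distinct_c", "DistinctCombiner"),
]
merged = group_mergeable(jobs)
# The two DISTINCT jobs land in the same group and could be merged into one job.
```

Grouping by combiner type keeps jobs with different combiners apart, which matches the constraint stated in the issue; a real optimizer would have further conditions that this sketch does not model.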




[jira] Updated: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs

2010-06-08 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1438:
--

Status: Patch Available  (was: Open)

> [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
> -
>
> Key: PIG-1438
> URL: https://issues.apache.org/jira/browse/PIG-1438
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1438.patch, PIG-1438_1.patch
>
>
> Current implementation doesn't merge jobs derived from DISTINCT statements. 
> The reason is that DISTINCT jobs are implemented using a special combiner 
> (DistinctCombiner). But we should be able to merge jobs that have the same 
> type of combiner (e.g. merge multiple DISTINCT jobs into one).




[jira] Updated: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs

2010-06-08 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1438:
--

Status: Open  (was: Patch Available)

> [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
> -
>
> Key: PIG-1438
> URL: https://issues.apache.org/jira/browse/PIG-1438
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1438.patch, PIG-1438_1.patch
>
>
> Current implementation doesn't merge jobs derived from DISTINCT statements. 
> The reason is that DISTINCT jobs are implemented using a special combiner 
> (DistinctCombiner). But we should be able to merge jobs that have the same 
> type of combiner (e.g. merge multiple DISTINCT jobs into one).




[jira] Commented: (PIG-1441) New test targets: unit and smoke

2010-06-08 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876912#action_12876912
 ] 

Olga Natkovich commented on PIG-1441:
-

I updated the patch to add some join tests to the smoke target for better 
coverage. I pulled a couple of tests from replicated and skew join, as these 
are the most commonly used. Note that I did not remove the tests from the 
original files, so we still have good coverage when we test a particular feature.

The tests now run in 23 minutes, which gives us room to add more in the 
future.

Please review the patch. I would really like to commit it tomorrow. Thanks.

> New test targets: unit and smoke
> 
>
> Key: PIG-1441
> URL: https://issues.apache.org/jira/browse/PIG-1441
> Project: Pig
>  Issue Type: Improvement
>Reporter: Olga Natkovich
>Assignee: Olga Natkovich
> Fix For: 0.8.0
>
> Attachments: PIG-1441.patch, PIG-1441_2.patch
>
>
> As we get more and more tests, adding more structure would help us 
> minimize the time spent on testing. Here are 2 new targets I propose we add 
> (Hadoop has the same targets for the same purposes).
> unit - to run all true unit tests (those that truly test APIs and 
> internal functionality, rather than running e2e tests through JUnit). This 
> target should run relatively quickly (10-15 minutes) and, if we are good at 
> adding unit tests, will give good coverage.
> smoke - this would be a set of a few e2e tests that provide good overall 
> coverage within about 30 minutes.
> I would say that for simple patches we would still require only commit tests, 
> while for more involved patches the developers should run both unit and 
> smoke before submitting the patch.




[jira] Updated: (PIG-1441) New test targets: unit and smoke

2010-06-08 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1441:


Attachment: PIG-1441_2.patch

> New test targets: unit and smoke
> 
>
> Key: PIG-1441
> URL: https://issues.apache.org/jira/browse/PIG-1441
> Project: Pig
>  Issue Type: Improvement
>Reporter: Olga Natkovich
>Assignee: Olga Natkovich
> Fix For: 0.8.0
>
> Attachments: PIG-1441.patch, PIG-1441_2.patch
>
>
> As we get more and more tests, adding more structure would help us 
> minimize the time spent on testing. Here are 2 new targets I propose we add 
> (Hadoop has the same targets for the same purposes).
> unit - to run all true unit tests (those that truly test APIs and 
> internal functionality, rather than running e2e tests through JUnit). This 
> target should run relatively quickly (10-15 minutes) and, if we are good at 
> adding unit tests, will give good coverage.
> smoke - this would be a set of a few e2e tests that provide good overall 
> coverage within about 30 minutes.
> I would say that for simple patches we would still require only commit tests, 
> while for more involved patches the developers should run both unit and 
> smoke before submitting the patch.




[jira] Commented: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs

2010-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876840#action_12876840
 ] 

Hadoop QA commented on PIG-1438:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12446604/PIG-1438.patch
  against trunk revision 952098.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/333/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/333/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/333/console

This message is automatically generated.

> [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
> -
>
> Key: PIG-1438
> URL: https://issues.apache.org/jira/browse/PIG-1438
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1438.patch
>
>
> Current implementation doesn't merge jobs derived from DISTINCT statements. 
> The reason is that DISTINCT jobs are implemented using a special combiner 
> (DistinctCombiner). But we should be able to merge jobs that have the same 
> type of combiner (e.g. merge multiple DISTINCT jobs into one).




[jira] Commented: (PIG-972) Make describe work with nested foreach

2010-06-08 Thread Aniket Mokashi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876764#action_12876764
 ] 

Aniket Mokashi commented on PIG-972:


describe c and describe c::d seem more intuitive. 
Further changes:
1. Currently, we do not use a deterministic way to search for a nested alias 
in the internal plans. 
With the changes, we will dump the schema of the latest statement for d.
For example, if we have
{code}
c = foreach b { d = order a by $0; d = filter d by d.$0 > 0; generate d.$1;}
describe c::d;
{code}
This will dump the schema for the last statement associated with d (the filter). 
This will be achieved by traversing the plan from leaves to root while searching 
for the nested alias d.
2. The nested alias list is redundant and will be removed.
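
The leaves-to-root lookup from item 1 can be sketched as follows. This is only an illustration under a simplifying assumption: the nested plan is modelled as a flat list of (alias, operator) pairs in statement order, and `resolve_nested_alias` is a hypothetical helper, not part of Pig's code base.

```python
from typing import List, Optional, Tuple


def resolve_nested_alias(statements: List[Tuple[str, str]], alias: str) -> Optional[str]:
    """Return the operator of the last statement bound to `alias`.

    `statements` is in program order (root first, leaf last), so walking it
    in reverse corresponds to searching from the leaves toward the root.
    """
    for stmt_alias, operator in reversed(statements):
        if stmt_alias == alias:
            return operator
    return None  # alias is not defined in the nested block


# Nested block from the example: d = order ...; d = filter ...;
nested_plan = [("d", "order"), ("d", "filter")]
resolved = resolve_nested_alias(nested_plan, "d")
missing = resolve_nested_alias(nested_plan, "x")
```

Because the search starts at the leaves, the first match is always the most recent definition of the alias, which makes the result deterministic when an alias is reassigned.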



> Make describe work with nested foreach
> --
>
> Key: PIG-972
> URL: https://issues.apache.org/jira/browse/PIG-972
> Project: Pig
>  Issue Type: Improvement
>Reporter: Olga Natkovich
>Assignee: Aniket Mokashi
> Fix For: 0.8.0
>
> Attachments: NestedDescribeProp1.patch, 
> NestedDescribeProp2Initial.patch
>
>
> Currently the parser can't deal with that. This is because describe is part of 
> the Grunt parser, while the rest of nested foreach is handled by the QueryParser.




[jira] Commented: (PIG-1427) Monitor and kill runaway UDFs

2010-06-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876763#action_12876763
 ] 

Ashutosh Chauhan commented on PIG-1427:
---

@Dmitriy,

Occupied with some work. Will get back to it sometime later this week.  

> Monitor and kill runaway UDFs
> -
>
> Key: PIG-1427
> URL: https://issues.apache.org/jira/browse/PIG-1427
> Project: Pig
>  Issue Type: New Feature
>Affects Versions: 0.8.0
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
> Attachments: guava-r03.jar, monitoredUdf.patch, monitoredUdf.patch, 
> PIG-1427.diff
>
>
> As a safety measure, it is sometimes useful to monitor UDFs as they execute. 
> It is often preferable to return null or some other default value instead of 
> timing out a runaway evaluation and killing a job. We have in the past seen 
> complex regular expressions lead to job failures due to just half a dozen 
> (out of millions) particularly obnoxious strings.
> It would be great to give Pig users a lightweight way of enabling UDF 
> monitoring.
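
The idea of returning a default value instead of letting a runaway evaluation kill the job can be sketched as below. This is a hedged illustration, not the attached patch: `call_with_timeout` and `slow_upper` are hypothetical names, and the sketch uses Python's standard `concurrent.futures` rather than anything from Pig. Note that the worker thread is not actually killed here; the caller simply stops waiting for it.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(func, arg, timeout_s, default=None):
    """Run func(arg) in a worker thread; return `default` if it exceeds timeout_s."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, arg)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return default
    finally:
        # Do not block on a runaway call; the worker finishes in the background.
        executor.shutdown(wait=False)


def slow_upper(value):
    """Hypothetical UDF that is slow on one particular input."""
    if value == "obnoxious":
        time.sleep(0.3)  # simulate a runaway evaluation
    return value.upper()


fast = call_with_timeout(slow_upper, "ok", timeout_s=1.0)
slow = call_with_timeout(slow_upper, "obnoxious", timeout_s=0.05)
```

With this pattern a handful of pathological inputs yield the default (here `None`) while the rest of the records are processed normally.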




[jira] Commented: (PIG-1409) Fix up javadocs for org.apache.pig.builtin

2010-06-08 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876755#action_12876755
 ] 

Alan Gates commented on PIG-1409:
-

Ping, can someone review this?

> Fix up javadocs for org.apache.pig.builtin
> --
>
> Key: PIG-1409
> URL: https://issues.apache.org/jira/browse/PIG-1409
> Project: Pig
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Minor
> Fix For: 0.8.0
>
> Attachments: PIG-1409.patch
>
>
> There are no external interfaces in this package to mark.  However, some of 
> the javadocs could use improvement.




[jira] Updated: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs

2010-06-08 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1438:
--

Status: Patch Available  (was: Open)

> [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
> -
>
> Key: PIG-1438
> URL: https://issues.apache.org/jira/browse/PIG-1438
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1438.patch
>
>
> Current implementation doesn't merge jobs derived from DISTINCT statements. 
> The reason is that DISTINCT jobs are implemented using a special combiner 
> (DistinctCombiner). But we should be able to merge jobs that have the same 
> type of combiner (e.g. merge multiple DISTINCT jobs into one).




[jira] Updated: (PIG-1438) [Performance] MultiQueryOptimizer should also merge DISTINCT jobs

2010-06-08 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1438:
--

Attachment: PIG-1438.patch

> [Performance] MultiQueryOptimizer should also merge DISTINCT jobs
> -
>
> Key: PIG-1438
> URL: https://issues.apache.org/jira/browse/PIG-1438
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1438.patch
>
>
> Current implementation doesn't merge jobs derived from DISTINCT statements. 
> The reason is that DISTINCT jobs are implemented using a special combiner 
> (DistinctCombiner). But we should be able to merge jobs that have the same 
> type of combiner (e.g. merge multiple DISTINCT jobs into one).




[jira] Commented: (PIG-1428) Add getPigStatusReporter() to PigHadoopLogger

2010-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876708#action_12876708
 ] 

Hadoop QA commented on PIG-1428:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12446095/PIG-1428.patch
  against trunk revision 952098.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 383 release audit warnings 
(more than the trunk's current 382 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/console

This message is automatically generated.

> Add getPigStatusReporter() to PigHadoopLogger
> -
>
> Key: PIG-1428
> URL: https://issues.apache.org/jira/browse/PIG-1428
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Ashutosh Chauhan
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.8.0
>
> Attachments: PIG-1428.patch, PIG-1428.patch
>
>
> Without this getter method, it's not possible to get counters, report progress, 
> etc. from UDFs. 




[jira] Updated: (PIG-1302) Include zebra's "pigtest" ant target as a part of pig's ant test target

2010-06-08 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated PIG-1302:


Attachment: PIG-1302.patch

This patch would add the pigtest target as part of ant test.

> Include zebra's "pigtest" ant target as a part of pig's ant test target
> ---
>
> Key: PIG-1302
> URL: https://issues.apache.org/jira/browse/PIG-1302
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Pradeep Kamath
> Attachments: PIG-1302.patch
>
>
> There are changes made in Pig interfaces which break zebra loaders/storers. 
> It would be good to run the pig tests in the zebra unit tests as part of 
> running pig's core-test for each patch submission. So essentially in the 
> "test" ant target in pig, we would need to invoke zebra's "pigtest" target.




[jira] Assigned: (PIG-1302) Include zebra's "pigtest" ant target as a part of pig's ant test target

2010-06-08 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan reassigned PIG-1302:
---

Assignee: Giridharan Kesavan

> Include zebra's "pigtest" ant target as a part of pig's ant test target
> ---
>
> Key: PIG-1302
> URL: https://issues.apache.org/jira/browse/PIG-1302
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Pradeep Kamath
>Assignee: Giridharan Kesavan
> Attachments: PIG-1302.patch
>
>
> There are changes made in Pig interfaces which break zebra loaders/storers. 
> It would be good to run the pig tests in the zebra unit tests as part of 
> running pig's core-test for each patch submission. So essentially in the 
> "test" ant target in pig, we would need to invoke zebra's "pigtest" target.
