COUNT, AVG and nulls

2009-07-06 Thread Olga Natkovich
Hi, The current implementation of COUNT and AVG in Pig counts null values. This is inconsistent with SQL semantics and also with semantics of other aggregated functions such as SUM, MIN, and MAX. Originally we chose this implementation for performance reasons; however, we re-implemented both
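To make the difference concrete, here is a minimal Python sketch of the two semantics under discussion. It is not Pig's implementation: the function names are hypothetical, `None` stands in for a Pig null, and the "counting nulls" variant assumes that a null adds to the denominator of AVG without adding to the sum.

```python
from typing import Iterable, Optional


def count_non_null(values: Iterable[Optional[float]]) -> int:
    """SQL-style COUNT(col): nulls (None here) are skipped."""
    return sum(1 for v in values if v is not None)


def avg_non_null(values: Iterable[Optional[float]]) -> Optional[float]:
    """SQL-style AVG(col): nulls are left out of both the sum and the count."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None  # AVG over no non-null values is null
    return sum(non_null) / len(non_null)


def avg_counting_nulls(values: Iterable[Optional[float]]) -> Optional[float]:
    """Assumed reading of the older behaviour: nulls still count as rows."""
    rows = list(values)
    if not rows:
        return None
    total = sum(v for v in rows if v is not None)
    return total / len(rows)


sample = [4.0, None, 8.0]
print(count_non_null(sample))      # 2
print(avg_non_null(sample))        # 6.0
print(avg_counting_nulls(sample))  # 4.0
```

With one null among three rows, the SQL-style average is 6.0 while the null-counting average is 4.0, which is the kind of inconsistency with SUM, MIN, and MAX that the message describes.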

Re: COUNT, AVG and nulls

2009-07-06 Thread Dmitriy Ryaboy
+1 for standard semantics. We need a COALESCE function to go along with this. -D On Mon, Jul 6, 2009 at 10:46 AM, Olga Natkovich ol...@yahoo-inc.com wrote: Hi, The current implementation of COUNT and AVG in Pig counts null values. This is inconsistent with SQL semantics and also with
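A rough Python sketch of why COALESCE pairs with null-skipping aggregates: once COUNT ignores nulls, substituting a default for each null is how a caller would get the rows counted again. The `coalesce` helper below is hypothetical and only mirrors the SQL function of the same name; `None` stands in for null.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def coalesce(*args: Optional[T]) -> Optional[T]:
    """Return the first argument that is not None, or None if all are."""
    for arg in args:
        if arg is not None:
            return arg
    return None


sample = [4.0, None, 8.0]

# A null-skipping count sees only the two non-null values.
skipped = sum(1 for v in sample if v is not None)

# Replacing nulls with a default first makes every row count again.
filled = [coalesce(v, 0.0) for v in sample]
counted = sum(1 for v in filled if v is not None)

print(skipped, counted)  # 2 3
```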

[jira] Commented: (PIG-872) use distributed cache for the replicated data set in FR join

2009-07-06 Thread Pradeep Kamath (JIRA)
[ https://issues.apache.org/jira/browse/PIG-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727648#action_12727648 ] Pradeep Kamath commented on PIG-872: Distributed cache can be used for the case where the

Re: COUNT, AVG and nulls

2009-07-06 Thread Yiping Han
+1. --Yiping On 7/6/09 10:58 AM, Dmitriy Ryaboy dvrya...@cloudera.com wrote: +1 for standard semantics. We need a COALESCE function to go along with this. -D On Mon, Jul 6, 2009 at 10:46 AM, Olga Natkovich ol...@yahoo-inc.com wrote: Hi, The current implementation of COUNT

[jira] Commented: (PIG-872) use distributed cache for the replicated data set in FR join

2009-07-06 Thread Milind Bhandarkar (JIRA)
[ https://issues.apache.org/jira/browse/PIG-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727660#action_12727660 ] Milind Bhandarkar commented on PIG-872: --- A couple of things: As Pradeep says, only the

[jira] Updated: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-07-06 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-820: - Attachment: pig-820_v7.patch Submitting the patch for review. Currently running tests. Will update

[jira] Commented: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-07-06 Thread Pradeep Kamath (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727834#action_12727834 ] Pradeep Kamath commented on PIG-820: Review comments - two observations: 1. In PigStorage

[jira] Updated: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-07-06 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-820: - Attachment: pig-820_v8.patch Thanks Pradeep for the review. skip(1) is not required because reading

[jira] Updated: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-07-06 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-820: - Status: Patch Available (was: Reopened) PERFORMANCE: The RandomSampleLoader should be changed to

Build failed in Hudson: Pig-Patch-minerva.apache.org #114

2009-07-06 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/114/changes Changes: [daijy] PIG-861: POJoinPackage lose tuple in large dataset -- [...truncated 95196 lines...] [exec] [junit] 09/07/07 04:28:59 INFO

[jira] Commented: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-07-06 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727907#action_12727907 ] Hadoop QA commented on PIG-820: --- -1 overall. Here are the results of testing the latest