On 3/7/11 9:55 PM, "deepak kumar v" <deepu....@gmail.com> wrote:

>Hi Pig Developers,
>This is my first dive into open source contribution and I hope to dive
>deep.
>
>I was going through https://issues.apache.org/jira/browse/PIG-671 and
>observed the following with COUNT.java
>
>COUNT.exec() always retrieves the first item from the input tuple, which it
>assumes is a bag, and counts the number of items in that bag.
>Even if we pass multiple arguments to COUNT(), it will always pick the
>first argument.
>
>There are a few ways we could handle this:
>a) Leave it as is, because it returns the correct result for counting the
>number of items in the first argument.
>OR
>b) Check the size of the input tuple in COUNT.exec() and, if it is not 1,
>throw ExecException or IllegalArgumentException (might be correct),
>which will cause the map job to fail.
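
For reference, option (b) could look roughly like the sketch below. It is plain Java with no Pig dependency: a List stands in for the input tuple and for the bag, and the class and method names (StrictCountSketch, strictCount) are made up for illustration. The real COUNT.exec() works on Pig's Tuple and DataBag types, so treat this only as an outline of the arity check.

```java
import java.util.Arrays;
import java.util.List;

public class StrictCountSketch {
    // Hypothetical stand-in for option (b): "input" plays the role of the
    // argument tuple; its single field is expected to be a bag (a List here).
    static long strictCount(List<?> input) {
        if (input == null || input.size() != 1) {
            throw new IllegalArgumentException(
                "COUNT expects exactly one argument, got "
                + (input == null ? 0 : input.size()));
        }
        Object first = input.get(0);
        if (!(first instanceof List)) {
            throw new IllegalArgumentException("COUNT expects a bag argument");
        }
        // Simplified: counts every item in the bag, as described in the mail.
        return ((List<?>) first).size();
    }

    public static void main(String[] args) {
        List<String> bag = Arrays.asList("a", "b", "c");

        // One argument: accepted.
        System.out.println(strictCount(Arrays.asList((Object) bag)));

        // Two arguments: rejected instead of silently using the first one.
        try {
            strictCount(Arrays.asList((Object) bag, (Object) bag));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at runtime like this is what would make the map job fail; rejecting the extra arguments earlier, before the job is launched, would be friendlier if Pig offers a hook for it.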

What about:

c) Count the number of non-null tuples in the bag (same as COUNT_STAR as
long as null tuples are not inserted somehow).  This is what users seem to
expect; I've seen several bugs due to users doing COUNT(FOO) and not
expecting it to be equivalent to COUNT(FOO.$0).
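
To make the difference concrete, here is a small plain-Java sketch of the three ways of counting. Nothing in it is Pig code: a bag is modelled as a list of tuples, a tuple as a list of fields, and the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

public class CountSemanticsSketch {
    // A "bag" is modelled as a list of tuples; a "tuple" as a list of fields.

    // COUNT_STAR-style: every entry in the bag, null or not.
    static long countAll(List<List<Object>> bag) {
        return bag.size();
    }

    // Option (c): tuples that are themselves non-null.
    static long countNonNullTuples(List<List<Object>> bag) {
        long n = 0;
        for (List<Object> t : bag) {
            if (t != null) n++;
        }
        return n;
    }

    // COUNT(FOO.$0)-style: tuples whose first field is non-null.
    static long countNonNullFirstField(List<List<Object>> bag) {
        long n = 0;
        for (List<Object> t : bag) {
            if (t != null && !t.isEmpty() && t.get(0) != null) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<Object>> foo = Arrays.asList(
            Arrays.<Object>asList("alice", 1),
            Arrays.<Object>asList(null, 2),      // first field is null
            Arrays.<Object>asList("carol", null));

        System.out.println(countAll(foo));               // 3
        System.out.println(countNonNullTuples(foo));     // 3
        System.out.println(countNonNullFirstField(foo)); // 2
    }
}
```

The second row is where the surprise comes from: the tuple exists, so a user expects it to be counted, but a first-field check drops it.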

>
>Let me know how we should go about it.
>
>
>Regards,
>Deepak
