On 3/7/11 9:55 PM, "deepak kumar v" <deepu....@gmail.com> wrote:
>Hi Pig Developers,
>This is my first dive into open source contribution and I hope to dive
>deep.
>
>I was going through https://issues.apache.org/jira/browse/PIG-671 and
>observed the following with COUNT.java
>
>COUNT.exec() always retrieves the first item from the input tuple, which it
>assumes is a bag, and counts the number of items in the bag.
>Even if we pass multiple arguments to COUNT(), it will always pick the
>first argument.
>
>There are a few ways we can go about this:
>a) Leave as is, because it returns the correct result for counting the
>number of items in the first argument.
>OR
>b) Make a check for the size of the input tuple in COUNT.exec() and if it
>is not 1 then throw ExecException() or IllegalArgumentException {might be
>correct}, which will cause the Map job to fail.

What about:
c) Count the number of non-null tuples in the bag (same as COUNT_STAR as
long as null tuples are not inserted somehow).

This is what users seem to expect; I've seen several bugs due to users
doing COUNT(FOO) and not expecting it to be equivalent to COUNT(FOO.$0).

>
>Let me know how we go about it.
>
>
>Regards,
>Deepak
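
To make the difference between the options concrete, here is a rough sketch
in plain Java. It is not the code in COUNT.java: java.util.List stands in for
Pig's tuple and bag types, and the class and method names (CountSketch,
requireSingleArgument, countNonNullFirstField, countNonNullTuples) are made
up for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: a "bag" is a List of tuples, a "tuple" is a List of fields.
// None of these names are Pig classes.
final class CountSketch {

    // Option (b): reject a call that has anything other than exactly one argument.
    static void requireSingleArgument(List<?> input) {
        if (input == null || input.size() != 1) {
            throw new IllegalArgumentException(
                "COUNT expects exactly one argument, got "
                + (input == null ? 0 : input.size()));
        }
    }

    // The behaviour that surprises users: COUNT(FOO) acting like COUNT(FOO.$0),
    // so a tuple is counted only when its first field is non-null.
    static long countNonNullFirstField(List<List<Object>> bag) {
        long n = 0;
        for (List<Object> tuple : bag) {
            if (tuple != null && !tuple.isEmpty() && tuple.get(0) != null) {
                n++;
            }
        }
        return n;
    }

    // Option (c): count every non-null tuple, whatever its fields hold.
    static long countNonNullTuples(List<List<Object>> bag) {
        long n = 0;
        for (List<Object> tuple : bag) {
            if (tuple != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<Object>> bag = Arrays.asList(
            Arrays.<Object>asList("a", 1),   // first field set
            Arrays.<Object>asList(null, 2),  // first field null
            null);                           // null tuple
        System.out.println(countNonNullFirstField(bag));
        System.out.println(countNonNullTuples(bag));
    }
}
```

With a bag holding one ordinary tuple, one tuple whose first field is null,
and one null tuple, the two counting methods disagree on the second tuple,
which is the mismatch described in (c).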