Hi,
Please find attached a patch that fixes PIG-671. It uses the approach
mentioned in the previous email (implementing getArgToFuncMapping).
I could not find any existing test cases for COUNT.java, and "ant test"
just hung when I ran it.

Output:
grunt> a = load 'test.txt';
grunt> x = foreach a generate COUNT(a.$0,a.$0);
grunt> dump x;
2011-03-08 14:45:03,408 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1045: Could not infer the matching function for
org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an
explicit cast.
Details at logfile:
/Users/deepakkv/Documents/opensource/pig/working/pig_1299575686422.log
grunt> b = group a all;
grunt> x = foreach b generate COUNT(a.$0,a.$0);
grunt> dump x;
2011-03-08 14:45:19,668 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1045: Could not infer the matching function for
org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an
explicit cast.
Details at logfile:
/Users/deepakkv/Documents/opensource/pig/working/pig_1299575686422.log
grunt> quit
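
To illustrate why the two-argument call above is rejected before any job
runs, here is a minimal stand-alone sketch of the idea. It is a simplified
model, not Pig's code: the class SignatureCheck, the enum ArgType and the
method accepts() are hypothetical names made up for this example, and it
only checks for an exact signature match, whereas Pig's real type checker
does more.

```java
import java.util.List;

// Hypothetical stand-alone model; none of these names are Pig classes.
final class SignatureCheck {
    enum ArgType { BAG, CHARARRAY, INTEGER }

    // Returns true only if some declared signature equals the argument
    // types of the call (same length, same types, same order).
    static boolean accepts(List<List<ArgType>> declared, List<ArgType> call) {
        for (List<ArgType> signature : declared) {
            if (signature.equals(call)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumption for this sketch: COUNT declares exactly one
        // signature, a single bag argument, as in the attached patch.
        List<List<ArgType>> count = List.of(List.of(ArgType.BAG));

        // Call with one bag argument.
        System.out.println(accepts(count, List.of(ArgType.BAG)));
        // Call with two bag arguments, like COUNT(a.$0,a.$0).
        System.out.println(accepts(count, List.of(ArgType.BAG, ArgType.BAG)));
    }
}
```

With a single declared signature, a call with any other number of
arguments has nothing to match, so it can be reported up front instead of
failing inside exec() at run time.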

Regards,
Deepak

On Tue, Mar 8, 2011 at 12:12 PM, deepak kumar v <[email protected]> wrote:

> Hi Dmitriy,
>
> I was looking at SUBSTRING.java, and that (getArgToFuncMapping) is exactly
> what I am trying now with COUNT.
> I am waiting for the build to complete so that I can test my changes
> before posting this option.
>
> Regards,
> Deepak
>
>
> On Tue, Mar 8, 2011 at 11:56 AM, Dmitriy Ryaboy <[email protected]> wrote:
>
>> Actually I think if you just implement getArgToFuncMapping for COUNT,
>> where you only return a mapping for a single bag argument, pig will notice
>> that the wrong number of args is supplied during the compilation phase and
>> no runtime exceptions will be required.
>>
>> I haven't checked how well the funcSpec mapping works with Bags, that's
>> something to experiment with.
>>
>> D
>>
>>
>> On Mon, Mar 7, 2011 at 9:55 PM, deepak kumar v <[email protected]> wrote:
>>
>>> Hi Pig Developers,
>>> This is my first dive into open source contribution and I hope to dive
>>> deep.
>>>
>>> I was going through https://issues.apache.org/jira/browse/PIG-671 and
>>> observed the following with COUNT.java
>>>
>>> COUNT.exec() always retrieves the first item from the input tuple, which
>>> it assumes is a bag, and counts the number of items in that bag.
>>> Even if we pass multiple arguments to COUNT(), it will always pick the
>>> first argument.
>>>
>>> There are a few ways we could handle this:
>>> a) Leave it as is, because it returns the correct result for counting the
>>> number of items in the first argument.
>>> OR
>>> b) Check the size of the input tuple in COUNT.exec() and, if it is not 1,
>>> throw ExecException or IllegalArgumentException (which might be the more
>>> correct one), which will cause the map job to fail.
>>>
>>> Let me know how we should go about it.
>>>
>>>
>>> Regards,
>>> Deepak
>>>
>>
>>
>
Index: src/org/apache/pig/builtin/COUNT.java
===================================================================
--- src/org/apache/pig/builtin/COUNT.java       (revision 1079171)
+++ src/org/apache/pig/builtin/COUNT.java       (working copy)
@@ -20,10 +20,13 @@
 import java.io.IOException;
 import java.util.Iterator;
 import java.util.Map;
+import java.util.List;
+import java.util.ArrayList;
 
 import org.apache.pig.Accumulator;
 import org.apache.pig.Algebraic;
 import org.apache.pig.EvalFunc;
+import org.apache.pig.FuncSpec;
 import org.apache.pig.PigException;
 import org.apache.pig.backend.executionengine.ExecException;
 import org.apache.pig.data.DataBag;
@@ -31,6 +34,7 @@
 import org.apache.pig.data.Tuple;
 import org.apache.pig.data.TupleFactory;
 import org.apache.pig.impl.logicalLayer.schema.Schema;
+import org.apache.pig.impl.logicalLayer.FrontendException;
 
 /**
  * Generates the count of the number of values in a bag.  This count does not
@@ -149,6 +153,18 @@
         return new Schema(new Schema.FieldSchema(null, DataType.LONG)); 
     }
     
+    /* (non-Javadoc)
+     * @see org.apache.pig.EvalFunc#getArgToFuncMapping()
+     */
+    @Override
+    public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
+        List<FuncSpec> funcList = new ArrayList<FuncSpec>();
+        Schema s = new Schema();
+        s.add(new Schema.FieldSchema(null, DataType.BAG));
+        funcList.add(new FuncSpec(this.getClass().getName(), s));
+        return funcList;
+    }
+
     /* Accumulator interface implementation */
     private long intermediateCount = 0L;
 
