Nice work.

You are going to want to make sure COUNT also works on the scenarios it's
supposed to work on. So far you only seem to be testing failures?

Also, write it up as proper unit tests so we don't get regressions.
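
As a rough illustration of the shape such tests could take, here is a
standalone sketch. It does not use Pig's classes: ArgType, COUNT_SIGNATURES,
matches and count are hypothetical stand-ins that only model the idea of a
function declaring the one signature it accepts. Real tests would go through
Pig's own test classes instead. The point is that the positive case (a single
bag) is checked alongside the rejections.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only; none of these names are part of Pig's API. */
class CountSignatureSketch {

    /** Hypothetical stand-in for Pig's data types. */
    enum ArgType { BAG, CHARARRAY, LONG }

    /** The single signature COUNT is assumed to declare: exactly one bag. */
    static final List<List<ArgType>> COUNT_SIGNATURES =
            Collections.singletonList(Collections.singletonList(ArgType.BAG));

    /** Front-end style check: true only if the argument types equal a declared signature. */
    static boolean matches(List<List<ArgType>> signatures, List<ArgType> args) {
        for (List<ArgType> signature : signatures) {
            if (signature.equals(args)) {
                return true;
            }
        }
        return false;
    }

    /** Counts the items in a bag, modeled here as a plain list (no null handling). */
    static long count(List<?> bag) {
        return bag.size();
    }

    public static void main(String[] args) {
        // Positive case: a single bag argument is accepted and counted.
        System.out.println(matches(COUNT_SIGNATURES, Arrays.asList(ArgType.BAG)));
        System.out.println(count(Arrays.asList("x", "y", "z")));

        // Negative cases: wrong number of arguments, then wrong type.
        System.out.println(matches(COUNT_SIGNATURES, Arrays.asList(ArgType.BAG, ArgType.BAG)));
        System.out.println(matches(COUNT_SIGNATURES, Arrays.asList(ArgType.CHARARRAY)));
    }
}
```

In this model the two-bag call is rejected before count is ever invoked, which
is the behaviour the tests should pin down for the real COUNT.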

D

On Tue, Mar 8, 2011 at 10:40 AM, deepak kumar v <deepu....@gmail.com> wrote:

> Hi Dmitriy,
> I will check out TestBuiltins.java once my Eclipse setup is ready.
> Meanwhile, I tried the two scenarios that you mentioned.
>
> 1) Schema defined for a
> grunt> a = load 'test.txt' as (data:chararray);
> grunt> b = group a all;
> grunt> describe a;
> a: {data: chararray}
> grunt> describe b;
> b: {group: chararray,a: {(data: chararray)}}
> grunt> x = foreach b generate COUNT(a.data, a.data);
> grunt> dump x;
> 2011-03-09 00:06:40,953 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1045: Could not infer the matching function for
> org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an
> explicit cast.
>
> 2) Schema not defined for a
> grunt> a = load 'test.txt';
> grunt> b = group a all;
> grunt> describe a;
> Schema for a unknown.
> grunt> describe b;
> b: {group: chararray,a: {(null)}}
> grunt> x = foreach b generate COUNT(a.$0, a.$0);
> grunt> dump x;
> 2011-03-09 00:07:58,715 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1045: Could not infer the matching function for
> org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an
> explicit cast.
>
>
> The changes seem to work in both scenarios.
>
> Regards,
> Deepak
>
>
>
> On Tue, Mar 8, 2011 at 10:45 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
>
>> ant test doesn't hang, it just runs for a very long time. If you want to
>> test something specific, you can name the test class like so:
>>
>> ant test -Dtestcase=TestBuiltins
>> (this will run the tests in TestBuiltins.java)
>>
>> COUNT tests are probably in TestBuiltins or in TestAlgebraic. Look around.
>>
>> You definitely want to add some tests to make sure that COUNT still works
>> on the cases where it's supposed to work, and that the Pig parser no longer
>> allows COUNT with the wrong number or type of arguments.
>>
>> I would test in particular what happens when a bag is supplied for which a
>> schema is known -- Pig might be making a distinction between a bag with a
>> known schema and a bag with an unknown schema, and we definitely want both
>> of those to work.
>>
>> D
>>
>>
>> On Tue, Mar 8, 2011 at 1:58 AM, deepak kumar v <deepu....@gmail.com> wrote:
>>
>>> Hi,
>>> Please find attached a patch that fixes PIG-671. I used the approach
>>> mentioned in my previous email.
>>> I could not find any test cases for COUNT.java; also, ant test just
>>> hung.
>>>
>>> Output:
>>> grunt> a = load 'test.txt';
>>> grunt> x = foreach a generate COUNT(a.$0,a.$0);
>>> grunt> dump x;
>>> 2011-03-08 14:45:03,408 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>> ERROR 1045: Could not infer the matching function for
>>> org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an
>>> explicit cast.
>>> Details at logfile:
>>> /Users/deepakkv/Documents/opensource/pig/working/pig_1299575686422.log
>>> grunt> b = group a all;
>>> grunt> x = foreach b generate COUNT(a.$0,a.$0);
>>> grunt> dump x;
>>> 2011-03-08 14:45:19,668 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>> ERROR 1045: Could not infer the matching function for
>>> org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an
>>> explicit cast.
>>> Details at logfile:
>>> /Users/deepakkv/Documents/opensource/pig/working/pig_1299575686422.log
>>> grunt> quit
>>>
>>> Regards,
>>> Deepak
>>>
>>> On Tue, Mar 8, 2011 at 12:12 PM, deepak kumar v <deepu....@gmail.com> wrote:
>>>
>>>> Hi Dmitriy,
>>>>
>>>> I was looking at SUBSTRING.java, and that (getArgToFuncMapping) is
>>>> exactly what I am trying now with COUNT.
>>>> I am waiting for the build to complete so that I can test my changes
>>>> before posting this option.
>>>>
>>>> Regards,
>>>> Deepak
>>>>
>>>>
>>>> On Tue, Mar 8, 2011 at 11:56 AM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
>>>>
>>>>> Actually, I think if you just implement getArgToFuncMapping for COUNT and
>>>>> return a mapping only for a single bag argument, Pig will notice during
>>>>> the compilation phase that the wrong number of args was supplied, and no
>>>>> runtime exceptions will be required.
>>>>>
>>>>> I haven't checked how well the funcSpec mapping works with Bags, that's
>>>>> something to experiment with.
>>>>>
>>>>> D
>>>>>
>>>>>
>>>>> On Mon, Mar 7, 2011 at 9:55 PM, deepak kumar v <deepu....@gmail.com> wrote:
>>>>>
>>>>>> Hi Pig Developers,
>>>>>> This is my first dive into open source contribution, and I hope to dive
>>>>>> deep.
>>>>>>
>>>>>> I was going through https://issues.apache.org/jira/browse/PIG-671 and
>>>>>> observed the following in COUNT.java:
>>>>>>
>>>>>> COUNT.exec() always retrieves the first item from the input tuple, assumes
>>>>>> it is a bag, and counts the number of items in that bag.
>>>>>> Even if we pass multiple arguments to COUNT(), it will always pick the
>>>>>> first argument.
>>>>>>
>>>>>> There are a couple of ways we could handle this:
>>>>>> a) Leave it as is, because it returns the correct result for counting the
>>>>>> number of items in the first argument.
>>>>>> OR
>>>>>> b) Check the size of the input tuple in COUNT.exec() and, if it is not 1,
>>>>>> throw an ExecException or IllegalArgumentException (might be correct),
>>>>>> which will cause the Map job to fail.
>>>>>>
>>>>>> Let me know how we should go about it.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Deepak
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
