AVG over chararrays is not a common use case, simply because it rarely
makes sense. For example, what would the average be for a bag of first or
last names? AVG would fail when it tried to convert such a String to an
Integer or Double.

In your case it is best to declare the field as int/long in the load schema
if you know the data type beforehand.
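
A minimal sketch of that suggestion, reusing the relation and field names from your script. It assumes every x value in 'input' is numeric text, which I have not checked against your data:

```pig
-- Declare x as int in the load schema instead of casting the bag afterwards.
-- 'input', b, x and y are the names from the original script.
A = load 'input' as (b:bag{t:(x:int, y:int)});

-- b.x is now a bag of ints, so the built-in AVG accepts it directly.
B = foreach A generate AVG(b.x);

-- The Grunt session quoted below reports the schema as B: {double}.
describe B;
```

The cast then happens once, when the data is loaded, rather than on the bag as a whole, which is what ERROR 1052 is complaining about.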

Thanks,
Prashant

2012/2/15 Haitao Yao <yao.e...@gmail.com>

> I solved this problem by extending the built-in AVG function to accept a
> chararray bag as input and calculate the result.
>
> Why can't the built-in AVG accept a chararray bag, convert the values to
> double, and calculate the result?
>
>
>
> On 2012-2-15, at 4:04 PM, Jonathan Coveney wrote:
>
> > The issue is that (int)b.x does not cast each x value in the bag to an
> > int; rather, it tries to cast the bag itself. Short of flattening out the
> > bag and projecting it as an int, which is inefficient, I suppose you could
> > make a UDF that calculates the average of chararrays by casting to an
> > int...but then that raises the question of why you couldn't just load it
> > as an x:int in the first place.
> >
> > So generally, you need to do something like "foreach rel generate (int)x".
> > In this case that doesn't work as efficiently, but this is kind of a
> > weird case.
> >
> > 2012/2/14 Haitao Yao <yao.e...@gmail.com>
> >
> >> Hi all,
> >> here's my Pig script:
> >>
> >> A = load 'input' as (b:bag{t:(x:int, y:int)});
> >> B = foreach A generate AVG(b.x);
> >> describe B;
> >>
> >> It works well.
> >> If b.x is a chararray, problems arise:
> >> A = load 'input' as (b:bag{t:(x:chararray, y:int)});
> >> B = foreach A generate AVG((int)b.x);
> >> 2012-02-15 14:17:17,937 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052:
> >> <line 4, column 28> Cannot cast bag with schema :bag{:tuple(x:chararray)} to int
> >> Details at logfile: /tmp/pig_1329286634873.log
> >>
> >> Why?  How can I calculate the avg of b.x if b.x must be a chararray?
> >>
> >>
> >> Here's the full session in Grunt:
> >>
> >> grunt> A = load 'input' as (b:bag{t:(x:int, y:int)});
> >> grunt> B = foreach A generate AVG(b.x);
> >> grunt> describe B;
> >> B: {double}
> >> grunt> A = load 'input' as (b:bag{t:(x:chararray, y:int)});
> >> grunt> B = foreach A generate AVG((int)b.x);
> >> 2012-02-15 14:17:17,937 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052:
> >> <line 4, column 28> Cannot cast bag with schema :bag{:tuple(x:chararray)} to int
> >> Details at logfile: /tmp/pig_1329286634873.log
> >> grunt>
> >>
> >> thanks.
> >>
> >>
>
>
