As for PRODUCT, I don't see why it could not be added as a builtin. It is a very generic, dependency-free function.

On Fri, May 3, 2013 at 1:36 PM, Sergey Goder <sergeygo...@gmail.com> wrote:

> Thanks for the tip about numerical accuracy issues and the elegant solution
> exploiting log/exp. It is very much appreciated.
>
> Sergey
>
>
> On Fri, May 3, 2013 at 11:42 AM, Kai Londenberg <
> kai.londenb...@googlemail.com> wrote:
>
> > Hi,
> >
> > Just a hint: It's usually better to work with log probabilities and sum
> > over them, than to work with raw probabilities and to use
> > multiplication. You might easily run into numerical accuracy issues
> > otherwise.
> >
> > i.e. exploit this fact:
> >
> > product(x1, ..., xn) = exp(sum(log(x1), ..., log(xn)))
> >
> > best,
> >
> > Kai Londenberg
> >
> > 2013/5/3 Sergey Goder <sergeygo...@gmail.com>:
> > > I'm creating a multinomial naive bayes classifier using pig and need to
> > > compute the product of probabilities. There are an arbitrary number of
> > > values in the bag so I would like to be able to use a function similar to
> > > the builtin SUM to do this. I looked through the source code and found that
> > > with some really simple changes to SUM.java I can create a PROD.java
> > > function. I included it in my piggybank and have been using it successfully.
> > >
> > > I was curious what the community thought about including this function as a
> > > builtin function in a future release? Or would it make more sense to keep
> > > this function as a udf in a piggybank.
> > >
> > > Thanks,
> > > Sergey
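
For anyone finding this thread later, here is a rough sketch of the log/exp identity Kai mentions, written in plain Python rather than as a Pig UDF. The function names (naive_product, log_product, product_via_logs) are made up for illustration and are not part of Pig or the piggybank. It assumes every value is strictly positive, since log(0) is undefined.

```python
import math

def naive_product(values):
    """Multiply the values directly; prone to underflow for many small factors."""
    result = 1.0
    for x in values:
        result *= x
    return result

def log_product(values):
    """Return log(x1 * ... * xn), computed as the sum of the logs.

    Assumes every value is > 0; math.log raises ValueError otherwise.
    """
    return sum(math.log(x) for x in values)

def product_via_logs(values):
    """product(x1, ..., xn) = exp(sum(log(x1), ..., log(xn)))."""
    return math.exp(log_product(values))

# 100 small probabilities: the true product is 1e-500, far below the
# smallest positive double, so direct multiplication collapses to 0.0.
probs = [1e-5] * 100
print(naive_product(probs))   # 0.0
print(log_product(probs))     # about -1151.29, still perfectly usable

# For moderate inputs the identity reproduces the ordinary product.
print(product_via_logs([0.5, 0.25, 0.5]))  # approximately 0.0625
```

Note that taking exp() of a very negative log-sum underflows just as the direct product does, so for a naive Bayes classifier it is probably simpler to compare the summed logs per class directly and never convert back. In Pig terms that would presumably mean summing the logs with the existing SUM builtin, though I have not tested that here.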