+1 for standard semantics. We need a COALESCE function to go along with this.
-D

On Mon, Jul 6, 2009 at 10:46 AM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
> Hi,
>
> The current implementation of COUNT and AVG in Pig counts null values.
> This is inconsistent with SQL semantics and also with the semantics of
> other aggregate functions such as SUM, MIN, and MAX. Originally we chose
> this implementation for performance reasons; however, we re-implemented
> both functions to support the multi-step combiner, and now the cost of
> checking for null in the case where the combiner is invoked is trivial.
> (I ran some tests with COUNT and they showed no performance difference.)
> We will pay a penalty in the non-combinable case, including local mode,
> but I think it is worth the price to have consistent semantics. Also, as
> we are working on SQL support, having SQL-compliant semantics becomes
> very desirable.
>
> Please let us know if you have any concerns. I am planning to make the
> change later this week.
>
> Olga
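
For anyone who wants the difference spelled out, here is a rough sketch in Python (not Pig code) of the two behaviors discussed in this thread. It assumes None stands in for a Pig null, and that "AVG counts nulls" means nulls add nothing to the sum but are still included in the divisor. All function names are made up for illustration and are not Pig's implementation.

```python
# Illustrative sketch only: None stands in for a Pig null, and every
# function name below is hypothetical, not part of Pig.

def count_with_nulls(values):
    """Behavior described as current: every element counts, nulls included."""
    return len(values)

def count_sql(values):
    """SQL-style COUNT(col): nulls are skipped."""
    return sum(1 for v in values if v is not None)

def avg_with_nulls(values):
    """Assumed current AVG: nulls add nothing to the sum but are in the divisor."""
    if not values:
        return None
    total = sum(v for v in values if v is not None)
    return total / len(values)

def avg_sql(values):
    """SQL-style AVG: nulls are ignored; all-null or empty input yields null."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    return sum(non_null) / len(non_null)

def coalesce(*args):
    """Return the first non-null argument, or None if all are null."""
    for a in args:
        if a is not None:
            return a
    return None

data = [4, None, 8, None]
print(count_with_nulls(data), count_sql(data))
print(avg_with_nulls(data), avg_sql(data))
# With COALESCE, a user can opt back into treating nulls as zero:
print(avg_sql([coalesce(v, 0) for v in data]))
```

The last line is the reason a COALESCE function pairs well with the change: once the aggregates skip nulls, anyone who still wants nulls treated as a default value can say so explicitly.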