Hi,

On 03/19/2016 07:34 AM, David Fetter wrote:
On Fri, Mar 18, 2016 at 06:12:12PM -0700, Jeff Janes wrote:
On Tue, Mar 15, 2016 at 8:36 AM, David Fetter <da...@fetter.org> wrote:

Please find attached a patch that uses the float8 version to cover the
numeric types.

Is there a well-defined meaning for having a negative weight?  If no,
should it be disallowed?

Opinions on this appear to vary.  A Wikipedia article defines weights
as non-negative, while a manual to which it refers only uses non-zero.

https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Mathematical_definition
https://www.gnu.org/software/gsl/manual/html_node/Weighted-Samples.html

I don't think that actually allows negative weights. It says that

   w_i = 1/\sigma_i^2

and variance is always > 0, thus w_i > 0. The zero is used as a special flag to remove the sample from the data set in a simple way.

I'm not sure which if either would be authoritative, but I could
certainly make up variants for each assumption.

The assumption they have in common about weights is that a zero
weight is not part of the calculation, which assumption is
implemented in the previously submitted code.

I think that if we're not sure what should happen with negative weights, then we should disallow them. It's easy to allow them later once we have a reasonable definition, but if we allow them now and later realize it should behave differently, we'll be in trouble because of breaking existing uses.

I can't really come up with a reasonable example that would actually use negative weights. Can you? That would probably help with defining the behavior correctly.

Allowing negative weights has other consequences. For example, what if sum(W) ends up being 0? For example

CREATE TABLE t (a float, b float);
INSERT INTO  t SELECT i,  1 FROM generate_series(1,1000) s(i);
INSERT INTO  t SELECT i, -1 FROM generate_series(1,1000) s(i);

SELECT weighted_avg(a,b) FROM t;
 weighted_avg
--------------
          NaN
(1 row)

Is that the correct behavior? Why?

So -1 to allowing negative weights, unless we can come up with proper definition or at least good examples demonstrating the usefulness.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to