> I heard someone say that SUM is a binary operation. It is not, because you
> can feed it any number of values. It is not the same as "+".

While I don't believe anyone has claimed SUM represents a binary
operation, the function's behavior has been contrasted to the binary
'+' operator.  Certainly, I hope we can all agree, when SUM is fed 2
values it should arrive at the same result as the '+' operator.  The
similarity between the two is worth exploring further:

As Jay explained earlier so well, NULL + 0 = NULL expressly because
NULL is valueless, leaving the binary operation with only one value,
which results in the expression not being able to evaluate properly. 
When you can't evaluate something, this results in an absence of value
(i.e. NULL).

Similarly, the SUM of a valueless entity (NULL) should evaluate to a
valueless entity (NULL).  If we instead determine that the SUM of an
empty set or series of NULLs evaluates to 0, we are inferring value
where none exists.

For example, let's say we want to calculate a value of "light
intensity" for a pixel, which we'll determine by taking the SUM of a
pixel's RGB hex values.  Now, let's say we try calculating the
intensity of a non-existent pixel.  If SUM( NULL, NULL, NULL)
evaluates to 0, this indicates a black pixel when in reality there is
no pixel at all.  Which prompts the question, how can "nothing"
possess a color?  Would we really want to assume the absence of color
is equivalent to pitch black?

> Can somebody come up with a realistic scenario where they would
> actually want SUM() to return NULL instead of 0? 

Yes; lets say I manage a system for a company that issues credit cards
to customers.  As they use their credit line, they accrue debt to my
company, and this is reflected in the database.

Now, lets say I want my staff to be able to access the amount a
customer owes, searching by a customer ID, and returning the SUM of
the customer's debts for each card she has been issued.  If she's paid
all her balances off (like a smart spender) we'd expect the SUM should
be 0.  But let's say that she owes us $20k and the staff member enters
in an incorrect ID # when looking up the balance.  The system takes
the SUM of NULL(s) and returns 0, leading to the mistaken
interpretation that she has paid her credit line off, when in
actuality the query was erroneous and she is deep in the red.

Would you not agree that NULL value should have been returned for this
case?  How else can we discriminate between erroneous cases and actual
zero-sums?  Yes, you can argue that the error here is the result of
the staff member mistyping an ID, but it's important to recognize we
are obfuscating the matter by returning a valid-appearing result when
the system should be returning an error.  It's these cases to watch
out for where the zero-sum shortcut will be potentially dangerous.


Lastly, consider that SUM exhibits the identity property when passed a
single value.  ex:
SUM(4) = 4
SUM(0) = 0
SUM(NULL) = NULL

If we pass SUM a NULL value (i.e. no value) we should still expect it
to return what we put in (i.e. no value).  Otherwise, we're
manufacturing zero-values which, while this may often be what *think*
we want, represents a confusing twist on standard expectations that
may mask deeper problems within a system.

Reply via email to