On Sep 9, 2005, at 2:59 PM, Cam Crews wrote:

I heard someone say that SUM is a binary operation. It is not, because you
can feed it any number of values. It is not the same as "+".

While I don't believe anyone has claimed SUM represents a binary
operation, the function's behavior has been contrasted with the
binary '+' operator.  Certainly, I hope we can all agree that when
SUM is fed two values it should arrive at the same result as the '+'
operator.  The similarity between the two is worth exploring further:

As Jay explained earlier so well, NULL + 0 = NULL expressly because
NULL is valueless, leaving the binary operation with only one value,
so the expression cannot be evaluated properly.  When you can't
evaluate something, the result is an absence of value (i.e. NULL).

there is a potential problem here...

any set operation can be reduced eventually to a binary operation.

  SUM(4, 3, 2, 1)
= SUM(4, 3, 3)
= SUM(4, 6)
= 4 + 6
= 10

still, most would agree that

  SUM(4, NULL, 2, 1)
= 7

even though

  SUM(NULL, NULL, NULL, NULL)
= NULL

the above would be the common expectation for most folks for all aggregate operations, except for COUNT, where
  COUNT(NULL, NULL, NULL, NULL)
= 4

without any controversy.
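
a quick way to sanity-check those expectations against an actual SQL engine is to put the values in a throwaway one-column table (the table name t here is made up) and let the aggregates run over it:

  -- hypothetical one-column table holding the values being aggregated
  CREATE TABLE t (v INTEGER);
  INSERT INTO t VALUES (4), (3), (2), (1);

  SELECT SUM(v) FROM t;               -- 10, same as 4 + 3 + 2 + 1

  UPDATE t SET v = NULL WHERE v = 3;  -- the set is now 4, NULL, 2, 1
  SELECT SUM(v) FROM t;               -- 7: the NULL contributes nothing

  UPDATE t SET v = NULL;              -- every value is now NULL
  SELECT SUM(v) FROM t;               -- NULL: nothing left to add up
  SELECT COUNT(*) FROM t;             -- 4: rows are counted, NULL or not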


Similarly, the SUM of a valueless entity (NULL) should evaluate to a
valueless entity (NULL).  If we instead determine that the SUM of an
empty set or series of NULLs evaluates to 0, we are inferring value
where none exists.

For example, let's say we want to calculate a value of "light
intensity" for a pixel, which we'll determine by taking the SUM of a
pixel's RGB hex values.  Now, let's say we try calculating the
intensity of a non-existent pixel.  If SUM( NULL, NULL, NULL)
evaluates to 0, this indicates a black pixel when in reality there is
no pixel at all.  That prompts the question: how can "nothing"
possess a color?  Would we really want to assume the absence of color
is equivalent to pitch black?

actually, when you are dealing with RGB, absence of "color" is absence of light, which is, in effect, pitch black. Now, if you were talking CMYK... still, you make a valid point.
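
for what it's worth, that missing-pixel lookup is easy to play with in SQL too. a minimal sketch, assuming a made-up pixels(id, r, g, b) table in which a row exists only for pixels that actually exist:

  CREATE TABLE pixels (id INTEGER PRIMARY KEY, r INTEGER, g INTEGER, b INTEGER);
  INSERT INTO pixels VALUES (1, 255, 128, 0);

  -- intensity of pixel 1 is 255 + 128 + 0 = 383; swap in id = 99,
  -- which does not exist, and SUM sees no values at all, so it
  -- returns NULL rather than 0: "no pixel", not "a black pixel"
  SELECT SUM(c) AS intensity
  FROM (SELECT r AS c FROM pixels WHERE id = 1
        UNION ALL SELECT g FROM pixels WHERE id = 1
        UNION ALL SELECT b FROM pixels WHERE id = 1) AS channels;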



Can somebody come up with a realistic scenario where they would
actually want SUM() to return NULL instead of 0?

Yes; let's say I manage a system for a company that issues credit cards
to customers.  As they use their credit line, they accrue debt to my
company, and this is reflected in the database.

Now, let's say I want my staff to be able to access the amount a
customer owes, searching by a customer ID, and returning the SUM of
the customer's debts for each card she has been issued.  If she's paid
all her balances off (like a smart spender) we'd expect the SUM to
be 0.  But let's say that she owes us $20k and the staff member enters
an incorrect ID # when looking up the balance.  The system takes
the SUM of NULL(s) and returns 0, leading to the mistaken
interpretation that she has paid her credit line off, when in
actuality the query was erroneous and she is deep in the red.

Would you not agree that a NULL value should have been returned in
this case?  How else can we discriminate between erroneous cases and
actual zero-sums?  Yes, you can argue that the error here is the
result of the staff member mistyping an ID, but it's important to
recognize that we are obfuscating the matter by returning a
valid-looking result when the system should be returning an error.
It's cases like these, where the zero-sum shortcut is potentially
dangerous, that we need to watch out for.
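
To make that concrete, here is roughly how the lookup might go in SQL
(a sketch only; the debts table, its columns, and the IDs are made
up), assuming SUM follows the usual SQL rule of returning NULL when
it has no non-NULL values to aggregate:

  -- one row per card a customer has been issued
  CREATE TABLE debts (customer_id INTEGER, card_id INTEGER, balance NUMERIC);
  INSERT INTO debts VALUES (1001, 1, 15000.00);
  INSERT INTO debts VALUES (1001, 2, 5000.00);

  -- correct ID: she owes 20000
  SELECT SUM(balance) FROM debts WHERE customer_id = 1001;

  -- mistyped ID: no rows match, so SUM returns NULL, which is
  -- distinguishable from a genuinely paid-off balance of 0
  SELECT SUM(balance) FROM debts WHERE customer_id = 1010;

  -- forcing the zero-sum shortcut erases that distinction
  SELECT COALESCE(SUM(balance), 0) FROM debts WHERE customer_id = 1010;  -- 0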


well, the system should be returning an error only if someone programs it to recognize that condition as an error. The question is whether or not the following should be programmed in

SUM|AVE|MIN|MAX(of all NULL values) = NULL
SUM|AVE|MIN|MAX(of some NULL and some NON NULL values) = SUM|AVE|MIN|MAX(of all NON NULL values)
COUNT(of all NULL OR some NULL and some NON NULL values) = count of all elements in the set, null or not

I say, yes, as it makes sense to me.
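
for what it's worth, most SQL engines already behave very close to that, with one wrinkle worth calling out: COUNT(*) counts every row, NULL or not, while COUNT(column) skips NULLs. a quick sketch against another made-up one-column table:

  CREATE TABLE vals (v INTEGER);
  INSERT INTO vals VALUES (4), (NULL), (2), (1);    -- some NULL and some NON NULL values

  SELECT SUM(v), AVG(v), MIN(v), MAX(v) FROM vals;  -- 7, 2.333..., 1, 4 (NULLs skipped)
  SELECT COUNT(*) FROM vals;                        -- 4: every element, null or not
  SELECT COUNT(v) FROM vals;                        -- 3: only the non-NULL values

  UPDATE vals SET v = NULL;                         -- all NULL values
  SELECT SUM(v), AVG(v), MIN(v), MAX(v) FROM vals;  -- all NULL
  SELECT COUNT(*) FROM vals;                        -- still 4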


--
Puneet Kishor
