> I heard someone say that SUM is a binary operation. It is not, because you
> can feed it any number of values. It is not the same as "+".
While I don't believe anyone has claimed SUM represents a binary operation, the function's behavior has been contrasted with the binary '+' operator. Certainly, I hope we can all agree, when SUM is fed two values it should arrive at the same result as the '+' operator.

The similarity between the two is worth exploring further. As Jay explained so well earlier, NULL + 0 = NULL precisely because NULL is valueless, leaving the binary operation with only one value, so the expression cannot be evaluated properly. When you can't evaluate something, the result is an absence of value (i.e. NULL). Similarly, the SUM of a valueless entity (NULL) should evaluate to a valueless entity (NULL). If we instead decide that the SUM of an empty set or a series of NULLs evaluates to 0, we are inferring value where none exists.

For example, let's say we want to calculate a "light intensity" value for a pixel, which we'll determine by taking the SUM of the pixel's RGB hex values. Now, let's say we try calculating the intensity of a non-existent pixel. If SUM(NULL, NULL, NULL) evaluates to 0, this indicates a black pixel when in reality there is no pixel at all. Which prompts the question: how can "nothing" possess a color? Would we really want to assume the absence of color is equivalent to pitch black?

> Can somebody come up with a realistic scenario where they would
> actually want SUM() to return NULL instead of 0?

Yes; let's say I manage a system for a company that issues credit cards to customers. As they use their credit line, they accrue debt to my company, and this is reflected in the database. Now, let's say I want my staff to be able to look up the amount a customer owes by searching on a customer ID and returning the SUM of the customer's debts across each card she has been issued. If she's paid all her balances off (like a smart spender), we'd expect the SUM to be 0.
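That expectation holds when the lookup is valid; a minimal sketch using SQLite from Python (the `debts` table, its columns, and the customer IDs are all hypothetical):

```python
import sqlite3

# Hypothetical schema: one row per card a customer has been issued.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE debts (customer_id INTEGER, balance REAL)")

# Customer 2 has two cards, both paid off.
conn.executemany("INSERT INTO debts VALUES (?, ?)", [(2, 0.0), (2, 0.0)])

paid_off = conn.execute(
    "SELECT SUM(balance) FROM debts WHERE customer_id = 2").fetchone()[0]
print(paid_off)  # 0.0 -- a genuine zero-sum, not an error
```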
But let's say that she owes us $20k and the staff member enters an incorrect ID when looking up the balance. The system takes the SUM of NULL(s) and returns 0, leading to the mistaken interpretation that she has paid her credit line off, when in actuality the query was erroneous and she is deep in the red. Would you not agree that NULL should have been returned in this case? How else can we discriminate between erroneous cases and actual zero-sums? Yes, you can argue that the error here is the result of the staff member mistyping an ID, but it's important to recognize that we are obfuscating the matter by returning a valid-appearing result when the system should be returning an error. It's cases like these where the zero-sum shortcut becomes potentially dangerous.

Lastly, consider that SUM exhibits the identity property when passed a single value:

SUM(4) = 4
SUM(0) = 0
SUM(NULL) = NULL

If we pass SUM a NULL value (i.e. no value), we should still expect it to return what we put in (i.e. no value). Otherwise, we're manufacturing zero-values which, while this may often be what we *think* we want, represents a confusing twist on standard expectations that may mask deeper problems within a system.
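Both points can be checked against a real engine. A minimal sketch with SQLite from Python (the `debts` table, its columns, and the IDs and dollar figures are hypothetical), showing that a mistyped ID yields NULL rather than 0, and that SUM over a single value returns that value unchanged:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE debts (customer_id INTEGER, balance REAL)")

# Customer 1 owes $20k across two cards.
conn.executemany("INSERT INTO debts VALUES (?, ?)",
                 [(1, 15000.0), (1, 5000.0)])

# A mistyped ID matches no rows; SUM over the empty set is NULL (surfaced
# as Python's None), keeping the erroneous lookup distinguishable from a
# genuine zero balance.
mistyped = conn.execute(
    "SELECT SUM(balance) FROM debts WHERE customer_id = 99").fetchone()[0]
print(mistyped)  # None

# The identity property over a single value, NULL included:
print(conn.execute("SELECT SUM(x) FROM (SELECT 4 AS x)").fetchone()[0])     # 4
print(conn.execute("SELECT SUM(x) FROM (SELECT 0 AS x)").fetchone()[0])     # 0
print(conn.execute("SELECT SUM(x) FROM (SELECT NULL AS x)").fetchone()[0])  # None
```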