I'd like to use CASE to compare an expression x against a number of
candidate values. That's the typical use for "CASE x WHEN", which
avoids repeating x for each condition.
The trouble is that one of the possible values is NULL, yet the
comparison against each candidate value is done with the = operator.
The expression "x = NULL" is meaningless since it will always evaluate
to NULL, which CASE interprets as untruth, hence "WHEN NULL THEN" will
never accomplish anything.
The workaround is to not use "CASE x WHEN" and instead use "CASE WHEN"
and repeat x every time, using = for all non-NULL values and IS for
NULL. But this means repeating x for each condition, which is the whole
reason I'd prefer "CASE x WHEN".
A compromise is to do both, as follows:
CASE x
WHEN 1 THEN 11
WHEN 2 THEN 22
WHEN 3 THEN 33
WHEN 4 THEN 44
ELSE CASE
WHEN x IS NULL THEN 55
ELSE 66
END END
Or the other way around, so that both instances of x are near each other:
CASE WHEN x IS NULL THEN 55
ELSE CASE x
WHEN 1 THEN 11
WHEN 2 THEN 22
WHEN 3 THEN 33
WHEN 4 THEN 44
ELSE 66
END END
If the CASE statement enumerates all possible values of x (whose range
is perhaps guaranteed with a CHECK or FOREIGN KEY constraint), then the
ELSE case can stand in for the NULL comparison. But otherwise, the ELSE
case will unavoidably collect not only NULL but any other unhandled values.
So I'm wondering: can we do better? I wouldn't want to risk changing
the meaning of any existing queries, but it is generally possible to
extend from the error space: take something that's currently a syntax
error and give it meaning. How about the following?
CASE x
WHEN 1 THEN 11
WHEN 2 THEN 22
WHEN 3 THEN 33
WHEN 4 THEN 44
WHEN IS NULL THEN 55
ELSE 66
END
Or equivalently, "ISNULL" instead of "IS NULL".
This treatment could also be applied to numerous other operators that
take an expression as their left-hand side and produce a truth result:
?NOT? LIKE|GLOB|REGEXP|MATCH expr ?ESCAPE expr?
NOTNULL
NOT NULL
< <= > >= != <>
= == (for completeness, even though it's implied)
?NOT? BETWEEN expr AND expr
?NOT? IN list-generation-expression
Giving us syntax such as:
CREATE TABLE fruits (name); [... veggies ... meats ...]
CASE food
WHEN IN fruits THEN 'fruit'
WHEN IN veggies THEN 'veggie'
WHEN IN meats THEN 'meat'
WHEN ISNULL THEN 'unspecified'
ELSE 'candy' END
CASE filename
WHEN GLOB '.*' THEN 'hidden'
WHEN GLOB '*.png' THEN 'image'
WHEN GLOB '*.html' THEN 'webpage'
WHEN REGEXP '\.docx?$' THEN 'MS-Word'
ELSE 'data' END
CASE age
WHEN < 1 THEN 'baby'
WHEN < 3 THEN 'toddler'
WHEN < 5 THEN 'preschooler'
WHEN < 12 THEN 'gradeschooler'
WHEN < 18 THEN 'teenager'
WHEN < 21 THEN 'young adult'
ELSE 'adult' END
CASE hour
WHEN BETWEEN 6 AND 6.5 THEN 'wake'
WHEN BETWEEN 7 AND 7.5 THEN 'breakfast'
WHEN BETWEEN 8 AND 8.5 THEN 'commute'
WHEN BETWEEN 11.5 AND 12.5 THEN 'lunch'
WHEN BETWEEN 9 AND 17 THEN 'work'
WHEN BETWEEN 17.5 AND 18 THEN 'commute'
WHEN BETWEEN 19 AND 19.5 THEN 'dinner'
WHEN BETWEEN 22 AND 24 THEN 'sleep'
WHEN BETWEEN 0 AND 6 THEN 'sleep'
END
To make the above examples more compelling, replace food, filename, age,
and hour with complex expressions such as nested queries.
The next evolution in bloat is to also support AND, OR, NOT, and
parentheses, allowing the LHS operand of any operator in a complex
expression to be omitted, defaulting to CASE's first argument. In the
last example above this would allow the two "sleep" cases (or the
"commute" cases) to be combined with OR: "WHEN BETWEEN 22 AND 24 OR
BETWEEN 0 and 6 THEN 'sleep'". But I imagine this would complicate the
parser far beyond any practical benefit.
--
Andy Goth | <andrew.m.goth/at/gmail/dot/com>
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users