On 2015-07-23 04:56 PM, Igor Tandetnik wrote:
> On 7/23/2015 10:47 AM, Bernardo Sulzbach wrote:
>>> select * from tbl1 where substr(col1,1,1) in ('A','a') and col1 like
>>> 'a_c%';
>>
>> Just adding to Igor's answer:
>> col1 between 'a' and 'b' or col1 between 'A' and 'B'
>
> That's not quite the same. BETWEEN is inclusive of both ends. ('b'
> BETWEEN 'a' and 'b') is true.
To add:
I think someone suggested before to do things like BETWEEN 'a' AND
'azzzzzzzzzzzz' or similar because 'a' < 'aa' so alphabetical this
should be correct. Of course if you wish to catch all words starting
with 'a' the best is to supply the highest standard-printable character
after the a, which in the normal ASCII set should be ~ so that BETWEEN
'a' AND 'a~' should catch all normal ASCII strings, and then the
extended set goes to 254 (which is hard to type on a keyboard) but you
can do BETWEEN 'a' AND 'a'||char(254) etc.
ref: http://www.asciitable.com/ [1]
Of course once you get to UNICODE strings, the highest valid character
that can follow 'a' becomes more complicated and encoding-dependent.
Therefore it is (to me) always the best to do: LIKE 'a%' - which the
SQLite Query Planner knows to optimize and use against an index - plus
keep using ci type collations (NOCASE in SQLite).
"WHERE substring(X,1,Y)" could similarly be optimized I guess, though
that would fall in a category of use-cases that should be exceedingly
rare (but if easy to do, can't be harmful to add).
Cheers,
Ryan
[1] For fun, notice how the characters [, /, ], ^ and _ follows after
the capitals but before the lowercase letters, which ends up in some
unexpected sorting cases where the following are all true:
'A' < '_A' but 'a' > '_a'
or put another way:
'JohnAnderson' < 'John_Anderson' - but - lowercase('JohnAnderson') >
lowercase('John_Anderson')
whereas...
'JohnAnderson' > 'John Anderson' - and - lowercase('JohnAnderson') >
lowercase('John Anderson')
also...
'{A}' > '[b]' > '(z)'
'{A{B}' < '{A|B}' but '{A}B}' > '{A|B}'