Re: [sqlite] How to specify collating sequences in an expression.

Ken Thu, 01 Feb 2007 06:16:36 -0800

DRH and Sqlite Community,
 
 Provide two sql functions: toupper() and tolower() that can be applied. And 
always use binary comparison.
 
 so:
 select x from where toupper(y) = 'HELLO' ;
 would return 1 row...
 
 But here is the gotcha, more than likely applying that function would negate 
the use of an index that would be on column y. I would think that apply a 
function transformation for use on an indexed column would have pretty bad 
performance. 
 
 On the other hand something like:
 select y from where x= tolower('HELLO' ) ;
 would also return 1 row. And should have no problems utilizing an existing 
indices since the function is only applied once to the sql, not to the stored 
index data values.
 
 Also other systems such as Oracle for instance use things like NLS_COMP, 
NLS_SORT to manipulate sorting characteristics. These can be environment 
variables ora can be set as part of an "alter session" command (Similar to a 
sqlite PRAGMA).
 
 Regards,
 Ken


[EMAIL PROTECTED] wrote: SQLite has supported collating sequences since version 
3.0.0.
A collating sequence is really a specification on how comparison
operators work on strings.

You can have arbitrary collating sequences in SQLite.  But for
the sake of this discussion, lets focus on just two:  BINARY
which is the default collating sequence, and NOCASE which does
case insensitive comparisons (on US-ASCII).

Consider this example table:

   CREATE TABLE t1(
     x TEXT COLLATE BINARY,
     y TEXT COLLATE NOCASE
   );
   INSERT INTO t1 VALUES('hello','Hello');

This query returns one row because comparisons against column y
ignore case:

   SELECT * FROM t1 WHERE y='HELLO';

This query returns no rows because comparisons against column x
take case into account.

   SELECT * FROM t1 WHERE x='HELLO';

When comparing two columns, the column on the left determines which
collating sequence to use.  In the first of the following two queries,
the column on the left uses NOCASE so one row is returned.  But in
the second query, the columns are reversed and the left-most column
uses BINARY.  As a result, no rows are returned from the second
query:

   SELECT * FROM t1 WHERE y=x;
   SELECT * FROM t1 WHERE x=y;

This last point seems a little goofy, but SQLite is documented as
working that way and the situation comes up so very rarely that nobody
has yet complained.

The problem with all of the above is that the collating sequence
on a column is specified when the column is declared and cannot
be changed.  What many people would like to have is some way to 
specify a different collating sequence to override the default
collating sequence for a single comparison.  The question is, what
should the syntax be.  Here are some ideas:

    SELECT * FROM t1 WHERE x='HELLO' COLLATE NOCASE;

Here the = operator has some added syntax at the end.  There some
parsing abiguities with this approach, but they can be dealt with
just as they are with the EXCEPT clause on a LIKE operator.  The
comparison operator syntax would be like this:

       [COLLATE ]

Another idea is to use CAST:

     SELECT * FROM t1 WHERE CAST(x AS TEXT COLLATE NOCASE)='HELLO';

Or perhaps

     SELECT * FROM t1 WHERE CASE(x COLLATE NOCASE)='HELLO';

A third idea is to invent entirely new syntax, perhaps like this:

     SELECT * FROM t1 WHERE COLLATE NOCASE(x='HELLO')

Please note that while I have used the = operator in all of the
examples above, everything applies equally to !=, <, <=, >, and >=.

Questions for SQLite community members:

  (1)  Is there some standard SQL syntax that I do not know about
       for doing this kind of thing?

  (2)  How do other SQL engines do this kind of thing?

  (3)  Do you have any additional (hopefully better) ideas for
       specifying alternative collating sequences on individual
       comparison operators.

  (4)  What syntax do you prefer?

--
D. Richard Hipp  



-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Re: [sqlite] How to specify collating sequences in an expression.

Reply via email to