Re: [sqlite] BETWEEN and explicit collation assignment

Roman Fleysher Thu, 22 Aug 2013 18:02:44 -0700

> create table t(x text collate nocase);
> insert into t values ('A');
> select count(*) from t where x = 'a';

> In your opinion, what result should this select statement produce? 1, as 
> it does now? 0, as Simon's approach seems to imply? Something else?

Since logically it is the operation that needs a modifier, then it should be 
supplied and not derived from the operands. If I put aside backward 
compatibility and take position of purity and clarity to the extreme (because 
taking extreme positions clarifies thinking), then this should fail since no 
rule to compare is fully specified. The "=" must be accompanied by a modifier. 
Once the modifier is supplied, comparison can proceed and index can be 
identified. 

Thus, I would use lengthy version for clarity:

create table t(x text)
create index t(x collate nocase)
create index t(x collate someothercase)
insert into t values ('A');
select count(*) from t where x = 'a' nocase;

Roman

________________________________________
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Igor Tandetnik [i...@tandetnik.org]
Sent: Thursday, August 22, 2013 8:14 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] BETWEEN and explicit collation assignment

On 8/22/2013 6:59 PM, Roman Fleysher wrote:
> First, I do not expect any changes to SQLite to be made as a result of this 
> discussion.

You might not, but Simon seems to:

"It's too late to contrafit this into SQLite3, but I'm wondering whether
SQLite4 might have the collation operator rethought along these lines."

> You are right, the discussion is largely philosophical. (I do not imply 
> "philosophical" means "impractical", which are sometimes equated.) But the 
> questions you ask are also philosophical: for is there something that can be 
> done in C++ that can not be done in Fortran or Assembler?

There are lots of things that can be done much more quickly,
conveniently and reliably in C++ than in Fortran or Assembler. The
cost-benefit analysis has been performed, and the market has spoken.

But let's even set the costs aside: what are the benefits? In what way
is your proposed approach an improvement over the status quo? Sorry for
being dense here, but I honestly do not understand what advantages you
believe you are gaining. Could you explain it to me like I'm five?

> These are the questions, as I understand, Simon is asking. Can collate rule 
> be expressed in such a way that it is clear it modifies the operation not the 
> value?

This question would perhaps be interesting to explore if there were
widespread confusion on this topic. But I'm not aware of any evidence
that such confusion exists. I have followed this group for years (and so
did Simon): in my experience, the issue simply fails to arise in
practice. Or indeed in theory: I don't recall any prior discussion of
this topic, ever.

> Because COLLATE should be a modifier to comparisons, then COLLATE should be a 
> modifier to index. We agree here, I think. I do not know if it is possible to 
> build two indices on the same column using two different collate rules.

Yes it is possible. You can specify a COLLATE clause on a column in
CREATE INDEX statement. If you don't, then the collation assigned to
that column in CREATE TABLE, if any, is used. If neither place specifies
a collation, the default is BINARY.

> Assuming comparison should specify collate rule (implicitly or explicitly 
> without which comparison is impossible)...

That's precisely where the disagreement lies. Should comparison derive
its collation implicitly from its operands? I say yes, and that's
exactly the mechanism currently in place. Simon seems to say no, and
wants to introduce a different mechanism. What's your position?

Once again, a concrete example:

create table t(x text collate nocase);
insert into t values ('A');
select count(*) from t where x = 'a';

In your opinion, what result should this select statement produce? 1, as
it does now? 0, as Simon's approach seems to imply? Something else?

> Thus the collation should not be picked up from the column, it must be picked 
> up from the comparison.

Once again, see the example above. Are you willing to argue that 0 is
the correct result there? That would seem to be the logical conclusion
of your statement.

> It is possible to adopt column's collate rule, if it is unique, to be used by 
> default if no rule is explicitly specified, as you would like to see.

So, should the collation be picked up from the column, or should it not
be? You are making two contradictory statements right next to each
other. I'm confused.

> It is then clear that this adaptation is a shortcut which should raise FAIL 
> when columns with different collations are compared without explicit rule.

Yes, I've already conceded that the case where operands have conflicting
collations could be treated as an error. But that doesn't requite new
syntax or new machinery, just a minor tweak to existing rules. It
doesn't need to raise FAIL at query execution time - all the information
is available at prepare, it could be treated as a syntax error.
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] BETWEEN and explicit collation assignment

Reply via email to