On 8/22/2013 4:06 PM, Roman Fleysher wrote:
Logically, I agree with Simon, collate modifier in table definition describes
how indices should be built, nothing more.
But for what purpose are indexes built? Isn't it for the purpose of
being used to speed up queries? Wouldn't you expect a query like
select count(*) from t where x = 'a';
to actually use an index on t(x), when available? And if you do, doesn't
that mean that the comparison (x = 'a') should use the same collation as
an index on t(x), that is, the collation associated with column x? And
if so, then you are right back to the idea that the collation used by
the comparison operator should, at least sometimes, be inferred from the
collation associated with its operands. Therefore, the COLLATE clause in
the table definition should apply not just to building indexes, but also
to the behavior of comparisons. QED.
Where is the flaw in this chain of reasoning, in your opinion?
Comparisons, as in these examples should specify which collation is to be used
and if it is different from one used for indexing, then yes, Igor, index can
not be used. I believe this is the present behavior already.
Obviously, if you explicitly specify a collation for a comparison that
doesn't match that of an index, then the index cannot be used. The
question is, if you do *not* explicitly specify a collation for a
comparison, as in example above - would you expect an index to be used?
Personally, I would, and I rather like the fact that SQLite currently
does use an index in this case. Simon, on the other hand, appears to be
saying that he would rather the index not be used, that the comparison
be done using BINARY collation (though he never answered my direct
question about it).
I do not think Simon asks to change the way collations work. He is asking to make clear
how they work. I think Simon is asking for clarity so that error like "(x collate A)
= (y collate B)" can not take place.
What to do with the case of (x = y) where x and y are two columns with
different collations, then? How do you ensure that it cannot take place?
And if you can't prevent it, then you need to do *something* in this
case, and then you can do the exact same thing with "(x collate A) = (y
collate B)" case - no need to invent new syntax or machinery to try and
avoid it.
This is similar to JOIN operation, which requires two tables to have common
column.
JOIN operation requires no such thing. I don't understand this analogy.
Using this analogy, I would write:
select * from t WHERE x = y USING NOCASE
How is this different from
select * from t WHERE x = y COLLATE NOCASE
I only see a superficial difference in the keyword used. What's the
improvement?
I *do* understand what you and Simon are suggesting. I *do* understand
that you want some kind of syntax that would hang a collation onto the
operator, rather than its operands, for certain philosophical and/or
aesthetic reasons. My point is, this would have to be done *in addition
to*, not *instead of*, the existing mechanism (because of the whole
"comparison must automatically pick the collation off the column in
order to use the index" thing). So it's strictly an added complexity.
The burden is then on the authors of the proposal to demonstrate the
benefits of the feature you are proposing, and to show that said
benefits do in fact outweigh the cost of this added complexity. So I
would ask you the same question I asked Simon: what exactly are the
benefits of your approach? Is there something that can be done your way
but can't be done the current way? Is there some bad outcome that's
possible the current way but prevented your way? Do you just prefer your
(as yet unspecified) syntax on purely aesthetic grounds?
--
Igor Tandetnik
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users