On 8/22/2013 4:06 PM, Roman Fleysher wrote:
Logically, I agree with Simon, collate modifier in table definition describes 
how indices should be built, nothing more.

But for what purpose are indexes built? Isn't it for the purpose of being used to speed up queries? Wouldn't you expect a query like

select count(*) from t where x = 'a';

to actually use an index on t(x), when available? And if you do, doesn't that mean that the comparison (x = 'a') should use the same collation as an index on t(x), that is, the collation associated with column x? And if so, then you are right back to the idea that the collation used by the comparison operator should, at least sometimes, be inferred from the collation associated with its operands. Therefore, the COLLATE clause in the table definition should apply not just to building indexes, but also to the behavior of comparisons. QED.

Where is the flaw in this chain of reasoning, in your opinion?

Comparisons, as in these examples should specify which collation is to be used 
and if it is different from one used for indexing, then yes, Igor, index can 
not be used. I believe this is the present behavior already.

Obviously, if you explicitly specify a collation for a comparison that doesn't match that of an index, then the index cannot be used. The question is, if you do *not* explicitly specify a collation for a comparison, as in example above - would you expect an index to be used? Personally, I would, and I rather like the fact that SQLite currently does use an index in this case. Simon, on the other hand, appears to be saying that he would rather the index not be used, that the comparison be done using BINARY collation (though he never answered my direct question about it).

I do not think Simon asks to change the way collations work. He is asking to make clear 
how they work. I think Simon is asking for clarity so that error like "(x collate A) 
= (y collate B)" can not take place.

What to do with the case of (x = y) where x and y are two columns with different collations, then? How do you ensure that it cannot take place? And if you can't prevent it, then you need to do *something* in this case, and then you can do the exact same thing with "(x collate A) = (y collate B)" case - no need to invent new syntax or machinery to try and avoid it.

This is similar to JOIN operation, which requires two tables to have common 
column.

JOIN operation requires no such thing. I don't understand this analogy.

Using this analogy, I would write:

select * from t WHERE x = y USING NOCASE

How is this different from

select * from t WHERE x = y COLLATE NOCASE

I only see a superficial difference in the keyword used. What's the improvement?


I *do* understand what you and Simon are suggesting. I *do* understand that you want some kind of syntax that would hang a collation onto the operator, rather than its operands, for certain philosophical and/or aesthetic reasons. My point is, this would have to be done *in addition to*, not *instead of*, the existing mechanism (because of the whole "comparison must automatically pick the collation off the column in order to use the index" thing). So it's strictly an added complexity.

The burden is then on the authors of the proposal to demonstrate the benefits of the feature you are proposing, and to show that said benefits do in fact outweigh the cost of this added complexity. So I would ask you the same question I asked Simon: what exactly are the benefits of your approach? Is there something that can be done your way but can't be done the current way? Is there some bad outcome that's possible the current way but prevented your way? Do you just prefer your (as yet unspecified) syntax on purely aesthetic grounds?
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to