You are correct, James, comparison depends on types. However, lets say we have suitcase and we want to test if it is bigger than the allowed limit, i.e. we have a biggest allowed suitcase to compare against. How do we answer: is this one bigger than the standard? Well, if this is a check-in bag, it has to be lighter than 50 lb, below a weight limit. If the suitcase is a carry-on it has to pass both weight limit and dimension limit. It is possible to cast the suitcase into carry-on subtype and conduct test there and to cast it to check-in type and test there too. But comparisons can be more complicated in general. If we have vectors, we may want to compare them by length (one sub-type of a vector), by one of its elements (one sub-type of vector per each element), by projection to a given vector (another sub-type), by angle that it makes with a vector (you guessed, another sub-type).
Thus, it is indeed possible to attach comparisons to types. This leads to enormous growth in number of types. Another solution is to realize that the comparator (the thing that makes comparisons) is actually an object of its own and it can be configured to apply different algorithms. Then, one does not need many subtypes: one suitcase and one vector. This is, as I understand, the approach implemented in SQLite: algorithm is passed as an argument to the comparator along with the objects. All to say something simple but not obvious: the rules of equality -- of any comparison -- are governed by the attribute(s) of the object relevant for comparison. What is relevant is not determined by the type of the object, but by the task at hand, in other words by the comparator. Each column in a table is supposed to hold an attribute of an object (of the record -- the row). But strings, stored as attributes of the record, are objects of their own. Composite objects, composed of other objects -- characters. The characters have upper and lower case, they can be printable and and non-printable, white spaces or not, bold and italic(?). Therefore, comparing strings requires specification which of the attributes of those underlying objects are relevant. Collation rule is such a specification, an algorithm that tells comparator how to compare two strings -- objects of the same type. Type alone is not enough, if we want all types to fit human heads. Roman ________________________________________ From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of James K. Lowden [jklow...@schemamania.org] Sent: Thursday, August 22, 2013 9:18 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] BETWEEN and explicit collation assignment On Thu, 22 Aug 2013 13:36:00 +0100 Simon Slavin <slav...@bigfraud.org> wrote: > > 1. where COLLATE( x AS NOCASE ) > > between COLLATE( 'a' AS NOCASE ) > > and COLLATE( 'b' AS NOCASE ) > > Again, you are trying to apply a collation to a value. And because > BETWEEN takes three operands you are being given the opportunity to > apply three collations for one condition. Which is nonsense. So the > syntax shouldn't allow it or even suggest it would work. You should > be casting the 'BETWEEN', not the operands you're supplying to it. There are two issues: 1. Is collation a property of the value, its type, or the operation? 2. Should the implied collation as currently implemented be retained, or should it be replaced? Before we can consider what syntax to use, we have to agree on what the semantics are, on what is means to compare two things. So let's dispense with item #2 until we agree on #1. Collation is not a property of the operation. There is no other kind of '=' than equality. Equal is absolute; there is no modifier for it. I understand your temptation. I've used strcasecmp(3) a few times myself. But when you realize there's no such thing in SQLite or in the relational model as an "operator modifier", you'll see the light. Collation is not a property of the value. Strings are just strings. They have an *encoding*, necessarily, but not a collation. Collation therefore must, by process of elimination, be a property of the type. And a good thing, too, because it's a property of the column, and a column is a type. Forget indexes; they're a feature of the implementation. Think logically, *semantically*: when are two things equal? Two values in a column either are or are not equal. What determines what "equal" *means*? The type! The question of whether A = B is true or false rests *not* on "how they are compared" but on *what* they are. The same is true for 'abc' = NAME To make that obvious, let's take a little mathematical excursion. Ask yourself how this is evaluated: where 1 = '1' SQL is logical (or so we like to think), and we're being asked to compare a number to a string, an apple to an orange. What to do? There are two ways to treat that: report a type error, or perform a type conversion. Currently, as it happens, SQLite does something rather surprising: sqlite> select 1 = '1', 1 * '1', 1 - '-1', 1 + '1', 1 / '1', 0 = '1'; 1 = '1' 1 * '1' 1 - '-1' 1 + '1' 1 / '1' 0 = '1' ---------- ---------- ---------- ---------- ---------- ---------- 0 1 2 2 1 0 As explained in http://www.sqlite.org/datatype3.html, "mathematical" operators have different rules from "comparison" operators. Mathematical binary operators demand two operands of the same type, and the conversion rule is that they "cast both operands to the NUMERIC storage class". No such rule exists for logical operators; afaict the comparison of strings to numbers is specified. On the evidence, no string is equal to any number, and no implicit conversion is defined. Note well: the operand didn't reach out and bang the operator on the head to modify its behavior. The operator converted the operand to a comparable type. Or not, as the case may be. All to say something simple but not obvious: the rules of equality -- of any comparision -- are governed by the type being compared. Now keep that in mind as you think about strings and their collations. when we evaluate: where 'abc' = NAME we must know: 1. The collation of NAME (not of the values in NAME) 2. The collation of 'abc' 3. The rules of conversion Dr. Hipp told us, "If X does not have a specified collation, then the collation of Y is used instead", which I would express as, "if only one of two types has the default collation, it is converted to the other type". Collations aren't "used"; they're *honored*! By that rule, if NAME has the default collation, the two operands are of the same type and can be compared. Else, if NAME's type uses a collation other than the default, 'abc' is converted to that type and the comparison is performed. That is the easiest way -- and, by the grace of Codd, the right way -- to think about it. Get your types in order, and then compare. It works for any comparison, and any type, and our friend equality remains unbothered. And unmodified. Regards, --jkl _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users