> > On Sat, Apr 5, 2008 at 11:42 PM, Cory Nelson <[EMAIL PROTECTED]> wrote: > > > Sort order is highly dependent on locale. You can add custom > > > collations to do this.
> <[EMAIL PROTECTED]> wrote: > > That was not was I was talking about. I was not talking about Sort Order but > > about Searches. > > Keith On Sat, Apr 05, 2008 at 02:58:48PM -0700, Cory Nelson scratched on the wall: > They are one and the same. Look up collations. I'm a bit confused why everyone keeps pointing Keith at collations. He asked about searching and matching, and in specific the LIKE operator. I'm not trying to be dense-- I read the collation web-page several times-- but I seem to be missing something. As I understand it, collations are used to define sort orders. A collation function must define it's own version of greater-than, less-than, and equal. All three of those definitions must be transitive (given A=B, B=C, then A=C; given A<B, B<C, then A<C, etc.) or sorting doesn't work. Collations only work on literals. The LIKE operator, on the other hand, only defines needs to define "match." This is not the same as "equal," as a LIKE operand can contain wildcards that allow the same pattern to match multiple un-equal targets. In other words, LIKE operators do not need to be transitive. They also don't have any concept of greater-than or less-than, only "match" or "no match." Both of those issues mean search operators can't be used for sorting (not that you could easily define a concept of greater-than for something like "5%" in the number domain). Even in code, the two seem unrelated. The LIKE operator doesn't use collation functions, not even the built-in ones. Even the default implementations for the NOCASE collation and the default LIKE operator use different comparison mechanisms (not a surprise, as they have different goals). User defined functions are also different. To define a new collation you use sqlite3_create_collation(), while the a user-defined LIKE operator uses the vanilla sqlite3_create_function(). So unless I'm missing something, it seems like collations used for sorting and matching operators are different systems, working under different constraints (transitive or not), that produce different results (tri-state GT/LT/EQ vs boolean match/no-match), using different types of functions (collation functions vs user functions). There very well may be some more fundamental issue I'm missing, but these really don't seem "one in the same" to me. I admit there is a similarity in the fact that both system compare data at some abstract level, such as "ignore case." In both cases that requires specific understanding of the data representation, and in the case of Unicode that's often going to require locale-specific information. But that statement could be said of just about anything that uses Unicode. Needing information about the representation is just as true of ASCII as it is of Unicode, it just happens that the rules for ASCII are pretty simple. It is also true that both the LIKE operator and the built-in NOCASE collation have issues dealing with anything beyond the 26 letters used by the English language. But they are two different bugs. Fixing one won't automatically fix the other. From my understanding of the original question, I think answer to Keith's question is actually on the language-expressions page (http://www.sqlite.org/lang_expr.html), which is the same page he quoted the bug from. This page describes the LIKE, GLOB, REGEXP, and MATCH operators (all of which return a "match/no-match" value). Also see the like(), glob(), regexp(), and match() functions under the Core Functions section of the same page. If you need full Unicode support for matches and searches, it looks like your only option is to define a custom like() user function that implements the search and matching behavior you're looking for. If you need full Unicode support for sort-ordering, you also need to define a new collation. Or maybe you can find one that someone else has already written.... A copy of the message Miha Vrhovnik mentioned can be found here: http://www.mail-archive.com/sqlite-users%40sqlite.org/msg30403.html It seems the .c file mentioned in this post has been updated since the link was first posted. In addition to a new Unicode aware LIKE operator, the newer .c file also includes a new NOCASE collation function. So you can get Unicode aware searching/matching *and* sort-ordering. Perhaps that will fit your needs. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "'People who live in bamboo houses should not throw pandas.' Jesus said that." - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006" _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users