Hello all, I have a case where the user needs to perform a search in a text column of a table with many rows. Typically the user enters the first n matching characters as a search string and the application issues a SELECT statement that uses the LIKE operator with the search string:
SELECT * FROM ATable WHERE AColumn LIKE :SearchString || '%' According to the LIKE optimization this statement will use an index so it will be fast. The application is used by Greek users. The greek alphabet has some letters that are visually identical to corresponding latin letters when in capitals, (e.g. A, E, I, O, B, H, K, etc), some that are different but match deterministically to a single latin letter (e.g. Σ = S, Φ = F, Ω = O etc.) and some that don't have a deterministic match (e.g. Γ = Gh or Yi). The table contains strings that consist of words that can be written in either latin or greek characters; sometimes even mixed (the user changed input locale midword before typing the first non-common letter). I have a request that the search should match strings that are written with either latin or greek or mixed letters, e.g. "MIS" should match "MISKO" (all latin), "ΜΙΣΚΟ" (all greek) or "MIΣΚΟ" (first two letters latin, rest greek). I thought of using a custom collation that does this type of comparison, have the column use that collation and create an index on that column to speed up the search but I discovered that the LIKE operator either will not use collations other than BINARY and NOCASE (pragma case_sensitive_like) or (if overloaded to perform custom matching) will not use an index, and, worse yet, its behaviour will be the same to all string comparisons regardless of collation. So, a full table scan seems inevitable. I was wondering whether it is realistic to ask for the LIKE operator to use by default the assigned collation of a column. I assume that an index on that column is using by default the specified collation of the column for comparisons, so a LIKE clause like the aforementioned can use the index and perform a fast search while using the "mixed" comparison I need. This would transparently solve my problem and make the case_sensitive_like pragma redundant, but for backward compatibility this behaviour could be activated by a new pragma. Are there any details I am missing that prevent this from being implemented? Thanks in advance. --Constantine. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users