On Thu, Aug 11, 2011 at 8:57 AM, Dennis Suehr <[email protected]> wrote:
> After some digging through the sqlite3 source code, I came across the code
> for the ICU tokenizer. After enabling that and then commenting out the one
> line where u_foldCase() is called, i.e. in icuOpen(), I retested and
> case-sensitive searching now seems to work as expected for FTS.
>
> I then tried doing the same thing in icuLikeCompare() by commenting out
> both u_foldCase() calls, i.e. for the string and the pattern, and seem to
> have implemented case-sensitive LIKE searching as well for non-FTS tables.
>
> Can anyone see anything wrong with this approach? I still plan to implement
> and register my own tokenizer, but will do it as highlighted above.

I think that's generally the right direction to take.

> Finally, if this is a valid approach, then can I suggest that an additional
> preprocessor macro be defined which would allow this behaviour to be enabled
> for the general SQLite release code.

This is probably a pretty low-volume use case, and once you're compiling
anyhow you're probably better off just adding a new tokenizer. That way you
won't have an issue where someone expects a particular tokenizer to be
case-insensitive, but it's actually case-sensitive.

Better might be to parameterize the tokenizer. This could be something like
allowing this:

  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple(NOCASE));
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple(BINARY));
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple);  -- defaults to NOCASE

-scott

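To make the parameterized-tokenizer idea a bit more concrete, here is a minimal sketch in plain C of how a tokenizer might map an optional NOCASE/BINARY argument to a mode and then fold (or not fold) each token. Everything in it is an assumption for illustration: the names `case_mode`, `parse_case_mode()` and `normalize_token()` are hypothetical and are not part of SQLite, and ASCII-only `tolower()` stands in for the `u_foldCase()` call that the ICU tokenizer uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical case modes for a parameterized tokenizer (not SQLite API). */
typedef enum { CASE_NOCASE, CASE_BINARY, CASE_INVALID } case_mode;

/* Map the optional tokenizer argument to a mode.
 * NULL or "" means no argument was given, so default to NOCASE. */
case_mode parse_case_mode(const char *arg)
{
    if (arg == NULL || arg[0] == '\0') return CASE_NOCASE;
    if (strcmp(arg, "NOCASE") == 0) return CASE_NOCASE;
    if (strcmp(arg, "BINARY") == 0) return CASE_BINARY;
    return CASE_INVALID; /* caller should reject the CREATE statement */
}

/* Normalize one NUL-terminated token in place.
 * Assumption: ASCII tolower() stands in for real Unicode case folding. */
void normalize_token(char *tok, case_mode mode)
{
    assert(tok != NULL);
    if (mode != CASE_NOCASE) return; /* BINARY: leave the bytes untouched */
    for (char *p = tok; *p != '\0'; ++p)
        *p = (char)tolower((unsigned char)*p);
}
```

Keeping the mode as per-table tokenizer state, rather than a build-time switch, is what would let NOCASE and BINARY tables coexist in one database.
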
Scott,

Many thanks for the reply and the suggestion. I reckon that is likely to be
the approach which I will take.

Regarding my suggestion for a new preprocessor macro: firstly, I agree that
it is a low-volume use case. However, I can suggest a way of doing it which
should not cause any confusion: make it available only for the ICU
tokenizer, then expect the user to compile with a new preprocessor macro,
called something like -DSQLITE_ENABLE_ICU_BINARY.

No big deal either way, but it does seem like a nice feature for some users,
which would be pretty painless to implement.

Thanks again,

Dennis

