On Thu, Aug 11, 2011 at 8:57 AM, Dennis Suehr <[email protected]> wrote:
> After some digging through the sqlite3 source code, I came across the code
> for the ICU tokenizer.  After enabling that and then commenting out the one
> line where u_foldCase() is called, i.e. in icuOpen(), I retested and
> case-sensitive searching now seems to work as expected for FTS.
>
> I then tried doing the same thing in icuLikeCompare() by commenting out
> both u_foldCase() calls, i.e. for the string and the pattern, and seem to
> have implemented case-sensitive LIKE searching as well for non-FTS tables.
>
> Can anyone see anything wrong with this approach?  I still plan to implement
> and register my own tokenizer, but will do it as highlighted above.

I think that's generally the right direction to take.

> Finally, if this is a valid approach, then can I suggest that an additional
> preprocessor macro be defined which would allow this behaviour to be enabled
> for the general SQLite release code.

This is probably a pretty low-volume use case, and once you're
compiling anyhow you're probably better off just adding a new
tokenizer.  That way you won't have an issue where someone expects a
particular tokenizer to be case-insensitive, but it's actually case
sensitive.

A better option might be to parameterize the tokenizer.  This could be
something like allowing this:
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple(NOCASE));
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple(BINARY));
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple);  -- defaults to NOCASE
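
To make the idea concrete, here is a rough, standalone sketch of how a
tokenizer might honor such an argument.  It is not the real FTS3 tokenizer
interface and does not touch SQLite at all: the names fold_mode,
parse_fold_mode() and next_token() are hypothetical, and it only handles
ASCII, so treat it as an illustration of the case-folding switch rather
than a drop-in implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option derived from the tokenizer argument. */
typedef enum { FOLD_NOCASE, FOLD_BINARY } fold_mode;

/* Map the argument string to a mode.  A missing or unrecognized
 * argument falls back to NOCASE, matching the proposed default. */
static fold_mode parse_fold_mode(const char *arg) {
    if (arg != NULL && strcmp(arg, "BINARY") == 0) return FOLD_BINARY;
    return FOLD_NOCASE;
}

/* Copy the next token (a run of ASCII alphanumerics) from input,
 * starting at *pos, into out.  Characters are lowercased only in
 * NOCASE mode.  Tokens longer than out_size - 1 are truncated.
 * Returns the stored token length, or 0 when no token remains. */
static size_t next_token(const char *input, size_t *pos, fold_mode mode,
                         char *out, size_t out_size) {
    assert(input != NULL && pos != NULL && out != NULL);
    size_t i = *pos, n = 0;

    /* Skip delimiters. */
    while (input[i] != '\0' && !isalnum((unsigned char)input[i])) i++;

    /* Collect the token, folding case only when asked to. */
    while (input[i] != '\0' && isalnum((unsigned char)input[i])) {
        if (n + 1 < out_size) {
            unsigned char c = (unsigned char)input[i];
            out[n++] = (char)(mode == FOLD_NOCASE ? tolower(c) : c);
        }
        i++;
    }
    if (out_size > 0) out[n] = '\0';
    *pos = i;
    return n;
}
```

In a real tokenizer the mode would be parsed once when the tokenizer is
created and stored alongside it, so the same table is always indexed and
queried with the same folding rule.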

-scott
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users