On Thu, Aug 11, 2011 at 8:57 AM, Dennis Suehr <[email protected]> wrote:
> After some digging through the sqlite3 source code, I came across the code
> for the ICU tokenizer.  After enabling that and then commenting out the one
> line where u_foldCase() is called, i.e. in icuOpen(), I retested and
> case-sensitive searching now seems to work as expected for FTS.
>
> I then tried doing the same thing in icuLikeCompare() by commenting out
> both u_foldCase() calls, i.e. for the string and the pattern, and seem to
> have implemented case-sensitive LIKE searching as well for non-FTS tables.
>
> Can anyone see anything wrong with this approach?  I still plan to implement
> and register my own tokenizer, but will do it as highlighted above.

I think that's generally the right direction to take.

> Finally, if this is a valid approach, then can I suggest that an additional
> preprocessor macro be defined which would allow this behaviour to be enabled
> for the general SQLite release code.

This is probably a pretty low-volume use case, and once you're
compiling anyhow you're probably better off just adding a new
tokenizer. That way you won't have an issue where someone expects a
particular tokenizer to be case-insensitive, but it's actually case
sensitive.

Better might be to parameterize the tokenizer. This could be
something like allowing this:

  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple(NOCASE));
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple(BINARY));
  CREATE VIRTUAL TABLE simple USING fts3(tokenize=simple); -- defaults to NOCASE
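
For illustration only, here is a rough sketch of how a parameterized tokenizer might interpret such an argument. It is not taken from the SQLite source: the names fold_mode, parse_fold_mode() and emit_token() are hypothetical, the argument match is assumed to be exact (upper case only), and the folding is plain ASCII rather than anything ICU would do.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical case-handling option for a "simple"-style tokenizer. */
typedef enum { FOLD_NOCASE = 0, FOLD_BINARY = 1 } fold_mode;

/*
** Interpret the optional tokenizer argument.  NULL or "" selects the
** default (NOCASE).  Returns 0 on success, -1 if the argument is not
** recognized.  Assumption: the match is exact, so "binary" is rejected.
*/
static int parse_fold_mode(const char *zArg, fold_mode *pMode){
  if( zArg==NULL || zArg[0]=='\0' || strcmp(zArg, "NOCASE")==0 ){
    *pMode = FOLD_NOCASE;
    return 0;
  }
  if( strcmp(zArg, "BINARY")==0 ){
    *pMode = FOLD_BINARY;
    return 0;
  }
  return -1;
}

/*
** Copy an n-byte token into zOut (which must hold n+1 bytes).  ASCII
** letters are lower-cased only in NOCASE mode; BINARY copies the bytes
** unchanged.  Bytes >= 0x80 are never altered.
*/
static void emit_token(const char *zIn, size_t n, fold_mode mode, char *zOut){
  size_t i;
  for(i=0; i<n; i++){
    unsigned char c = (unsigned char)zIn[i];
    zOut[i] = (mode==FOLD_NOCASE && c<0x80) ? (char)tolower(c) : (char)c;
  }
  zOut[n] = '\0';
}
```

In a real tokenizer module the mode would presumably be parsed once when the tokenizer is created and stored alongside its other state, so each table keeps the behaviour it was declared with.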
-scott


Scott,

Many thanks for the reply and the suggestion.  I reckon that is likely
to be the approach which I will take.

Regarding my suggestion for a new preprocessor macro: firstly, I agree
that it is a low-volume use case.  However, I can suggest doing it in
a way that should not cause any confusion: make it available only for
the ICU tokenizer, and expect the user to compile with a new
preprocessor macro called something like -DSQLITE_ENABLE_ICU_BINARY.
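
To make the idea concrete, a compile-time guard along these lines is roughly what I have in mind. This is only a sketch: SQLITE_ENABLE_ICU_BINARY is the proposed macro and does not exist in SQLite, normalize_token() is a hypothetical helper, and fold_ascii() is an ASCII-only stand-in for the u_foldCase() call so that the fragment compiles without ICU.

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the ICU folding step (the real code calls u_foldCase()).
** Lower-cases ASCII letters in place and leaves other bytes alone. */
static void fold_ascii(char *z, size_t n){
  size_t i;
  for(i=0; i<n; i++){
    unsigned char c = (unsigned char)z[i];
    if( c<0x80 ) z[i] = (char)tolower(c);
  }
}

/* Hypothetical helper: fold the token unless the build was configured
** with -DSQLITE_ENABLE_ICU_BINARY, in which case it is left untouched. */
static void normalize_token(char *z, size_t n){
#ifndef SQLITE_ENABLE_ICU_BINARY
  fold_ascii(z, n);
#else
  (void)z;
  (void)n;
#endif
}
```

A default build keeps the existing case-insensitive behaviour, so nobody gets case-sensitive matching without asking for it at compile time.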

No big deal either way, but it does seem like a nice feature for some
users, which would be pretty painless to implement.

Thanks again,

Dennis
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
