This is definitely a bug in SQLite. I have run into it too.

The problem stems from “getNextToken(…)” expecting to find the parentheses in 
the token delimiters (rather than the tokens themselves). The ICU tokenizer 
returns the parentheses as tokens, rather than ignoring them as delimiters as 
the simple tokenizer does.

Two possible fixes:
1. Fix getNextToken(...) to look in tokens as well as delimiters for parentheses
2. Fix icuNext to not return parentheses as tokens.

Option 1 seemed the easier one to hack in quickly, as a stopgap until there is 
an official fix.

In getNextToken, I changed: 
                        if (rc == SQLITE_DONE) iStart = n;
                        for (i = 0; i < iStart; i++) { 
                                if (z[i] == '(') {

to:

                        if (rc == SQLITE_DONE) iStart = n;
                        // 2014-04-12 DCRH: Tweak to make parens work with ICU tokenizer
                        for (i = 0; i < iEnd; i++) {
                                if (z[i] == '(') {

That way, it now searches the token text in addition to the preceding 
delimiters, and parentheses now work correctly with the ICU tokenizer.
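
To make the difference concrete, here is a minimal standalone sketch of the two 
scan ranges. It is not the SQLite source: TokenSpan, first_open_paren, 
paren_before_token and paren_through_token are hypothetical names made up for 
illustration, and the offsets in the note below are assumed from the behaviour 
described above rather than captured from a real tokenizer run.

```c
/* Hypothetical stand-in for the offsets a tokenizer reports for one token:
 * the token occupies bytes [iStart, iEnd) of the input buffer. */
typedef struct TokenSpan {
    int iStart;
    int iEnd;
} TokenSpan;

/* Return the index of the first '(' in z[0..limit), or -1 if there is none. */
static int first_open_paren(const char *z, int limit) {
    int i;
    for (i = 0; i < limit; i++) {
        if (z[i] == '(') return i;
    }
    return -1;
}

/* Original loop bound: scan only the delimiters that precede the token. */
static int paren_before_token(const char *z, TokenSpan t) {
    return first_open_paren(z, t.iStart);
}

/* Patched loop bound: scan the delimiters and the token text itself. */
static int paren_through_token(const char *z, TokenSpan t) {
    return first_open_paren(z, t.iEnd);
}
```

For the input "(abc", a tokenizer that treats "(" as a delimiter would report 
the token "abc" with iStart = 1 and iEnd = 4, so both functions find the 
parenthesis at index 0. A tokenizer that returns "(" as its own token would 
report iStart = 0 and iEnd = 1, so paren_before_token scans nothing and returns 
-1, while paren_through_token still finds it. Note that the wider range also 
matches a "(" that appears anywhere inside a token, which is part of why I 
consider this a hack.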

Hope this helps,

David
-- 
David Hedley
CTO
Vistair Systems Ltd