Re: [sqlite] Searching with like for a specific start letter
A practical situation? Lexicographical applications and full-text applications against text corpora require indexed substring searches, including ends-with searches. (The FTS extension is not always a good fit.) I am glad that only the LIKE operator has been overridden in Adobe's version and in the version that ships with the System.Data.SQLite (.NET) adapter; I'd be up the creek if both LIKE and GLOB had been overridden. I like your renaming suggestion but unfortunately that's not an option if the implementors want to make their implementation widely available and support standard syntax. Hence, Adobe and Google et al don't have a LIKEU(). Tim Romano On Wed, Apr 28, 2010 at 10:09 AM, Jean-Christophe Deschampswrote: > Tim, > > > I agree it is possible to overload LIKE and GLOB independantly but I > don't see a practical situation where overloading only one of them > would be desirable. > > For instance, if some extension overloads LIKE to support ICU, it would > be logical and consistent to overload GLOB with the same > function. Given that the two entries differ only by a parameter, > enjoying Unicode support in LIKE and not in GLOB (or vice-versa) would > be a bit strange. > > Should one have a need to keep the native functions untouched, there is > the easy possibility to call the new versions with new names (e.g. > LIKEU, GLOBU) even if that makes the SQL less standard. > > In short: possible yes, likely not much. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Tim, >But did I say that GLOB uses an index if it has been overloaded? No. I >wrote that if LIKE has been overloaded, queries that contain LIKE >won't use >the index. Typically, GLOB won't have been overridden too just >because LIKE >has been overridden: the rationale for overriding the LIKE operator >does not >apply equally to GLOB, and it would make little sense to override GLOB >in a >manner that vitiates its raison d'être. You are conflating these two >functions ("... if LIKE/GLOB has been overridden... overloads LIKE/GLOB") >but in important respects they are dissimilar. I agree it is possible to overload LIKE and GLOB independantly but I don't see a practical situation where overloading only one of them would be desirable. For instance, if some extension overloads LIKE to support ICU, it would be logical and consistent to overload GLOB with the same function. Given that the two entries differ only by a parameter, enjoying Unicode support in LIKE and not in GLOB (or vice-versa) would be a bit strange. Should one have a need to keep the native functions untouched, there is the easy possibility to call the new versions with new names (e.g. LIKEU, GLOBU) even if that makes the SQL less standard. In short: possible yes, likely not much. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Jean-Christophe, But did I say that GLOB uses an index if it has been overloaded? No. I wrote that if LIKE has been overloaded, queries that contain LIKE won't use the index. Typically, GLOB won't have been overridden too just because LIKE has been overridden: the rationale for overriding the LIKE operator does not apply equally to GLOB, and it would make little sense to override GLOB in a manner that vitiates its raison d'être. You are conflating these two functions ("... if LIKE/GLOB has been overridden... overloads LIKE/GLOB") but in important respects they are dissimilar. Regards Tim Romano On Mon, Apr 26, 2010 at 8:27 PM, Jean-Christophe Deschampswrote: > Tim, > > >Queries using GLOB do use the index on the column in question (i.e. > >optimization is attempted) > >Queries using LIKE do not use that index if the LIKE operator has been > >overridden. > > Sorry but GLOB doesn't use an index either if LIKE/GLOB has been > overloaded. This is consistent with the docs and the output of Explain > query plan for both variants when an extension is active and overloads > LIKE/GLOB. > > Things can be different with a custom built of SQLite, where native > LIKE/GLOB itself has been modified. With custom code, all bets are off. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Tim, >Queries using GLOB do use the index on the column in question (i.e. >optimization is attempted) >Queries using LIKE do not use that index if the LIKE operator has been >overridden. Sorry but GLOB doesn't use an index either if LIKE/GLOB has been overloaded. This is consistent with the docs and the output of Explain query plan for both variants when an extension is active and overloads LIKE/GLOB. Things can be different with a custom built of SQLite, where native LIKE/GLOB itself has been modified. With custom code, all bets are off. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Just to complete the thread, I decided for the following: SELECT substr(normalized,1,1) AS letter, COUNT(*) from entry group by letter order by letter; Thank you ALL! 2010/4/26 Alberto Simões: > Hello > > Thank you all for the answers. > > On Mon, Apr 26, 2010 at 12:59 PM, Black, Michael (IS) > wrote: >> When you say "running on the fly" do you mean running from an sqlite3 >> command prompt? > > I mean somebody will query it and will be waiting for the answer. > >> Or are you doing this in some other programming language? > > Perl > >> Why in the world would you use a database to do this? > > Probably I am not using a database for this, but for something else, > and I want to add a feature to let users read some statistics. > >> >> >> Hello >> >> I am running on the fly a query to count the number of words starting >> with one of the 26 letters. >> >> I am doing the usual SELECT COUNT(term) from dictionary WHERE normword >> LIKE "a%" (for the 26 letters) >> >> normword is the term normalized without accents and the like >> >> >> Is there any way to make this query faster? It is taking about 10 >> second for 140K entries. >> >> One idea is to add a column named 'letter' and SELECT COUNT(letter) >> from dictionary WHERE letter = 'a'. >> But are there other solutions? >> >> Thanks >> -- >> Alberto Simões >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> >> >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> >> > > > > -- > Alberto Simões > -- Alberto Simões ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Hello Thank you all for the answers. On Mon, Apr 26, 2010 at 12:59 PM, Black, Michael (IS)wrote: > When you say "running on the fly" do you mean running from an sqlite3 command > prompt? I mean somebody will query it and will be waiting for the answer. > Or are you doing this in some other programming language? Perl > Why in the world would you use a database to do this? Probably I am not using a database for this, but for something else, and I want to add a feature to let users read some statistics. > > > Hello > > I am running on the fly a query to count the number of words starting > with one of the 26 letters. > > I am doing the usual SELECT COUNT(term) from dictionary WHERE normword > LIKE "a%" (for the 26 letters) > > normword is the term normalized without accents and the like > > > Is there any way to make this query faster? It is taking about 10 > second for 140K entries. > > One idea is to add a column named 'letter' and SELECT COUNT(letter) > from dictionary WHERE letter = 'a'. > But are there other solutions? > > Thanks > -- > Alberto Simões > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > -- Alberto Simões ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Edit: I meant to type "Firefox" not Firebird. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
I am not quite sure what it is, and why it is, that you are doubting, Jean-Christophe. Queries using GLOB do use the index on the column in question (i.e. optimization is attempted) Queries using LIKE do not use that index if the LIKE operator has been overridden. You could confirm this claim from the docs and/or by using EXPLAIN QUERY PLAN, assuming you had access to a version of SQLite where the LIKE operator has been overridden (as it has been in Adobe AIR, Firebird, and in the version that ships with the System.Data.SQLite .NET provider as well, IIRC. Regards Tim Romano On Mon, Apr 26, 2010 at 9:24 AM, Jean-Christophe Deschampswrote: > At 14:31 26/04/2010, you wrote: > > >If the implementation of SQLite you are using overrides the LIKE operator > >(as more than a few do), then SQLite will not make use of an index on the > >column in question. Use the GLOB operator instead. > > I doubt it. GLOB is absolutely nothing more or less than an invokation > of the same code for LIKE but with slightly different > parameters. Except if people have made a completely differing version, > departing from the architecture of the standard SQLite code (and there > is little reason to, if any) AND have made LIKE and GLOB two completely > distinct functions, there shouldn't be any significant difference in > running time (for equivalent queries, of course). > > Also if ever LIKE is overloaded, then GLOB gets excluded from standard > optimization, except large changes in SQLite code. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
At 14:31 26/04/2010, you wrote: >If the implementation of SQLite you are using overrides the LIKE operator >(as more than a few do), then SQLite will not make use of an index on the >column in question. Use the GLOB operator instead. I doubt it. GLOB is absolutely nothing more or less than an invokation of the same code for LIKE but with slightly different parameters. Except if people have made a completely differing version, departing from the architecture of the standard SQLite code (and there is little reason to, if any) AND have made LIKE and GLOB two completely distinct functions, there shouldn't be any significant difference in running time (for equivalent queries, of course). Also if ever LIKE is overloaded, then GLOB gets excluded from standard optimization, except large changes in SQLite code. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Yes. If the OP's [normword] column contains proper nouns, he must normalize to lower case in order to get accurate results from GLOB. Or, if his lexicon contains proper nouns in upper case and normal nouns in lower case, then he could always leave the case intact and use GLOB to get a count of proper nouns versus normal nouns ;-) ... GLOB 'A*' ... GLOB 'a*' Regards Tim Romano On Mon, Apr 26, 2010 at 8:47 AM, Igor Tandetnikwrote: > Tim Romano wrote: > > If the implementation of SQLite you are using overrides the LIKE operator > > (as more than a few do), then SQLite will not make use of an index on the > > column in question. Use the GLOB operator instead. > > > > For example, I have a lexicon containing 263,000 words: > > > > select count(*) from lexicon where spelling like 'a%' // 552 ms on > first > > run and then 355ms on second and subsequent runs > > select count(*) from lexicon where spelling glob 'a*' // 110 ms on first > > run and then ~10ms on second and subsequent runs > > Note that, by default, LIKE is case-insensitive while GLOB is > case-sensitive. Thus, even if not overridden with a custom function, LIKE > cannot be optimized unless the column is declared with NOCASE collation. > Again, this article provides all the details: > http://sqlite.org/optoverview.html#like_opt > -- > Igor Tandetnik > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Tim Romano wrote: > If the implementation of SQLite you are using overrides the LIKE operator > (as more than a few do), then SQLite will not make use of an index on the > column in question. Use the GLOB operator instead. > > For example, I have a lexicon containing 263,000 words: > > select count(*) from lexicon where spelling like 'a%' // 552 ms on first > run and then 355ms on second and subsequent runs > select count(*) from lexicon where spelling glob 'a*' // 110 ms on first > run and then ~10ms on second and subsequent runs Note that, by default, LIKE is case-insensitive while GLOB is case-sensitive. Thus, even if not overridden with a custom function, LIKE cannot be optimized unless the column is declared with NOCASE collation. Again, this article provides all the details: http://sqlite.org/optoverview.html#like_opt -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
If the implementation of SQLite you are using overrides the LIKE operator (as more than a few do), then SQLite will not make use of an index on the column in question. Use the GLOB operator instead. For example, I have a lexicon containing 263,000 words: select count(*) from lexicon where spelling like 'a%' // 552 ms on first run and then 355ms on second and subsequent runs select count(*) from lexicon where spelling glob 'a*' // 110 ms on first run and then ~10ms on second and subsequent runs Alternatively: select substr(spelling,1,1), count (*) from lexicon group by substr(spelling,1,1) order by substr(spelling,1,1) // ~3500 ms on first run and then ~2400 ms on second and subsequent runs Of course, if your lexicon is static, you could create an ancillary table of first letters and their corresponding counts. Regards Tim Romano 2010/4/25 Alberto Simões> Hello > > I am running on the fly a query to count the number of words starting > with one of the 26 letters. > > I am doing the usual SELECT COUNT(term) from dictionary WHERE normword > LIKE "a%" (for the 26 letters) > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
When you say "running on the fly" do you mean running from an sqlite3 command prompt? Or are you doing this in some other programming language? Why in the world would you use a database to do this? Michael D. Black Senior Scientist Northrop Grumman Mission Systems From: sqlite-users-boun...@sqlite.org on behalf of Alberto Simões Sent: Sun 4/25/2010 3:39 PM To: General Discussion of SQLite Database Subject: [sqlite] Searching with like for a specific start letter Hello I am running on the fly a query to count the number of words starting with one of the 26 letters. I am doing the usual SELECT COUNT(term) from dictionary WHERE normword LIKE "a%" (for the 26 letters) normword is the term normalized without accents and the like Is there any way to make this query faster? It is taking about 10 second for 140K entries. One idea is to add a column named 'letter' and SELECT COUNT(letter) from dictionary WHERE letter = 'a'. But are there other solutions? Thanks -- Alberto Simões ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
On Sun, 25 Apr 2010 21:39:43 +0100, Alberto Simõeswrote: >Hello > > I am running on the fly a query to count the number of > words starting with one of the 26 letters. > > I am doing the usual SELECT COUNT(term) from dictionary WHERE normword > LIKE "a%" (for the 26 letters) > >normword is the term normalized without accents and the like Would your application allow to return all 26 in one query? SELECT COUNT(term) FROM dictionary WHERE normword >= 'a' AND normword <= 'zz' GROUP BY substr(normword,1,1); (untested, but certainly faster than 26 separate queries) >Is there any way to make this query faster? It is taking about 10 >second for 140K entries. > >One idea is to add a column named 'letter' and SELECT COUNT(letter) >from dictionary WHERE letter = 'a'. >But are there other solutions? > >Thanks -- ( Kees Nuyt ) c[_] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
On 25 Apr 2010, at 9:39pm, Alberto Simões wrote: > One idea is to add a column named 'letter' and SELECT COUNT(letter) > from dictionary WHERE letter = 'a'. That will be the simplest way to make a fast lookup, though it will slow down your INSERT function. You could speed it up a tiny bit more by making the search field an INTEGER field, and storing, for example, 65 in there for 'a', 66 for 'b', etc.. Two ways to do it: either put the right letter in the column when you create the record, or leave the column at a dummy default value when you create the record and have an 'update index' which updates all records which have the dummy default value. Which one you choose depends on whether your INSERT function has to run faster than it does naturally. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Searching with like for a specific start letter
Alberto Simões wrote: > I am running on the fly a query to count the number of words starting > with one of the 26 letters. > > I am doing the usual SELECT COUNT(term) from dictionary WHERE normword > LIKE "a%" (for the 26 letters) > > normword is the term normalized without accents and the like See if this condition works better: where normword >= 'a' and normword < 'b' See also http://sqlite.org/optoverview.html#like_opt -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Searching with like for a specific start letter
Hello I am running on the fly a query to count the number of words starting with one of the 26 letters. I am doing the usual SELECT COUNT(term) from dictionary WHERE normword LIKE "a%" (for the 26 letters) normword is the term normalized without accents and the like Is there any way to make this query faster? It is taking about 10 second for 140K entries. One idea is to add a column named 'letter' and SELECT COUNT(letter) from dictionary WHERE letter = 'a'. But are there other solutions? Thanks -- Alberto Simões ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users