Re: [sqlite] FTS3 phantom token?...
Thanks Marbex, Puneed and specially to Scott for the snippet() tip. 'ined' is Porter stemmed to 'in'! :-) So that is why it finds so many rows. Now, to find 17081 rows from a table with 25672, fts3 took 4.47 seconds! The query is: select title from fts_news where description match 'in' If I execute a similar query with 'like': select title from fts_news where description like '% in %' It takes only 1.36 seconds. 3.2 x faster for a sequential search vs an fts3 one! The point is that it seems that fts3 gets slower as the number of rows it has to return grows. Jochi Martinez www.bfreenews.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS3 phantom token?...
Right, Puneet, this is what I mean. I tried the same query with another database with 25.600 rows: select * from fts_news where description match 'ined' the query returns 17.080 rows which is wrong because If I do: select * from fts_news where description like '% ined %' The query returns 0 rows. I'm using SqliteSpy with sqlite 3.6.17 Jochi Martínez www.bfreenews.com On Sat, Sep 26, 2009 at 6:36 PM, marbex wrote: > > > Shopsland gmail wrote: >> >> ? ?select title from fts_news where fts_news match 'ined' >> > > It looks that you only want to query the title field, so the query should > be: > select title from fts_news where title match 'ined' > or > select title from fts_news where fts_news match 'title:ined' The OP is asking something completely different. Per the OP, there is no word with 'ined' as any part, yet, an FTS3 query is returning matches for 'ined' -- Puneet Kishor ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS3 phantom token?...
On Sat, Sep 26, 2009 at 5:31 PM, P Kishor wrote: > On Sat, Sep 26, 2009 at 6:36 PM, marbex wrote: >> Shopsland gmail wrote: >>> select title from fts_news where fts_news match 'ined' >> >> It looks that you only want to query the title field, so the query should >> be: >> select title from fts_news where title match 'ined' >> or >> select title from fts_news where fts_news match 'title:ined' > > The OP is asking something completely different. Per the OP, there is > no word with 'ined' as any part, yet, an FTS3 query is returning > matches for 'ined' But the porter stemmer is being used, which means that the match may not be literal. I don't know the specifics of porter, but since "ed" is something likely to be tweaked by stemming, and also "in", it's possible this is hitting an edge case of some sort. Use snippet() to figure out what it is that is being matched. -scott ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS3 phantom token?...
On Sat, Sep 26, 2009 at 6:36 PM, marbex wrote: > > > Shopsland gmail wrote: >> >> select title from fts_news where fts_news match 'ined' >> > > It looks that you only want to query the title field, so the query should > be: > select title from fts_news where title match 'ined' > or > select title from fts_news where fts_news match 'title:ined' The OP is asking something completely different. Per the OP, there is no word with 'ined' as any part, yet, an FTS3 query is returning matches for 'ined' -- Puneet Kishor ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS3 phantom token?...
Shopsland gmail wrote: > >select title from fts_news where fts_news match 'ined' > It looks that you only want to query the title field, so the query should be: select title from fts_news where title match 'ined' or select title from fts_news where fts_news match 'title:ined' -- View this message in context: http://www.nabble.com/FTS3-phantom-token-...-tp25594632p25629682.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] FTS3 phantom token?...
Hi, Mining my fts3 table stemmed with Porter I found that there is a 'phantom' token indexed by fts3. When I issue this query: select title from fts_news where fts_news match 'ined' I got about 700 rows that do not have the word 'ined' in any form (ie: ined or inedible). If I issue those queries: select title from fts_news where title like 'ined%' select title from fts_news where title like '% ined%' Not a single row is returned. I have optimized the fts3 index with: SELECT optimize(fts_news) FROM fts_news LIMIT 1; And did a vacuum but the phantom token 'ined' is not deleted. Can anyone confirm this? Jochi Martínez www.bfreenews.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users