subject:"\\\\\\\[sqlite\\\\\\\] FTS"

Re: [sqlite] small sqlite fts snippet function or Fts Bug!

2015-01-27 Thread Dan Kennedy


On 01/27/2015 06:48 PM, boscowitch wrote:



and the in an sqlite shell (SQLite version 3.8.8.1 2015-01-20 16:51:25)
I get following for a select with snippet:

EXAMPLE OUTPUT:
sqlite select docid,*,snippet(test) from test where german match a;
1|[1] a b c|1] ba/b b c
2|[{[_.,:;[1] a b c|1] ba/b b c
3|1[1] a b c|1[1] ba/b b c
4|[1] a b c|1] a bb/b c
5|[1] a b c|[1] ba/b b c



-As you can see for id 1 and 2 b is at the right position
but all beginning non-alphanumerical [,{, etc. are just left out in the
snippet.


-ID 4 does not help and breaks the offsets so even worse



Thanks for reporting this. The issue with (1) and (2) is now fixed here:

  http://www.sqlite.org/src/info/adc9283dd9b

I think it is a bug in the input data causing the problem in (4). The 
values inserted into test and testdata are just slightly different.


Dan.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] small sqlite fts snippet function or Fts Bug!

2015-01-27 Thread boscowitch

Yeah, -ID 4 was just a desperate experiment for a hack with longer data
in the search to see if it would lead the snippet function to start
grabbing the data from the start (or at least one word/char more).

The offsets beeing wrong and therefore the b... was kinda expected of
me, but in case it worked I would have manually substracted the offset
and put the markers in myself... so it wasn't part of the bug report
just a test how it behaves in that case.

Good to know for the future that its already fixed, thx for taking care
of it so fast!

boscowitch

Am Mittwoch, den 28.01.2015, 02:04 +0700 schrieb Dan Kennedy:
 On 01/27/2015 06:48 PM, boscowitch wrote:
 
 
  and the in an sqlite shell (SQLite version 3.8.8.1 2015-01-20 16:51:25)
  I get following for a select with snippet:
 
  EXAMPLE OUTPUT:
  sqlite select docid,*,snippet(test) from test where german match a;
  1|[1] a b c|1] ba/b b c
  2|[{[_.,:;[1] a b c|1] ba/b b c
  3|1[1] a b c|1[1] ba/b b c
  4|[1] a b c|1] a bb/b c
  5|[1] a b c|[1] ba/b b c
 
 
 
  -As you can see for id 1 and 2 b is at the right position
  but all beginning non-alphanumerical [,{, etc. are just left out in the
  snippet.
 
 
  -ID 4 does not help and breaks the offsets so even worse
 
 
 Thanks for reporting this. The issue with (1) and (2) is now fixed here:
 
http://www.sqlite.org/src/info/adc9283dd9b
 
 I think it is a bug in the input data causing the problem in (4). The 
 values inserted into test and testdata are just slightly different.
 
 Dan.
 
 
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] small sqlite fts snippet function or Fts Bug!

2015-01-27 Thread boscowitch

Hello since it this bug report (+ a dirty-fix) it might be useful for
both users and devs.
that's why I send a copy to both mailing lists! 
I hope I don't bother the diligent devs who read all of both list, sry
to them, and thx for sqlite btw. ;)!

recently I wanted to use the snippet function in sqlite for my small
sqlite dictionary (running on android but the bug occurs also on my
linux desktop).

but it behaved strangely when my entry started with non-words
character(s) (not alphanumeric and all Unicode (or chars128) in short
simple tokenizer delimiters)

the snippet never prints them if they are in the beginning  of the first
word
here an examples to demonstrate:

EXAMPLE SETUP SQL:
create table testdata (german);
create virtual table test using fts4(content=testdata,german);

insert into testdata(german) VALUES ([1] a b c);
insert into test(docid,german) VALUES(1,[1] a b c );

insert into testdata(german) VALUES ([{[_.,:;[1] a b c);
insert into test(docid,german) VALUES(2,[{[_.,:;[1] a b c ); 

insert into testdata(german) VALUES (1[1] a b c);
insert into test(docid,german) VALUES(3,1[1] a b c ); 

insert into testdata(german) VALUES ([1] a b c);
insert into test(docid,german) VALUES(4,1[1] a b c ); 

insert into testdata(german) 
VALUES(char(8203,91,49,93,32,97,32,98,32,99));
insert into test(docid,german)
VALUES(5,char(8203,91,49,93,32,97,32,98,32,99));


and the in an sqlite shell (SQLite version 3.8.8.1 2015-01-20 16:51:25)
I get following for a select with snippet:

EXAMPLE OUTPUT:
sqlite select docid,*,snippet(test) from test where german match a;
1|[1] a b c|1] ba/b b c
2|[{[_.,:;[1] a b c|1] ba/b b c
3|1[1] a b c|1[1] ba/b b c
4|[1] a b c|1] a bb/b c
5|[1] a b c|[1] ba/b b c



-As you can see for id 1 and 2 b is at the right position
but all beginning non-alphanumerical [,{, etc. are just left out in the
snippet.

-ID 3 works but has an additional 1 that should not be there so no
solution...

-ID 4 does not help and breaks the offsets so even worse

-ID 5 works BUT this is a dirty fix i found.
it adds an Unicode character ('ZERO WIDTH SPACE' (U+200B)) in front
which obviously cant be seen and doesnt break the offsets (just shifts
them all +1)
I didn't test it yet on android but I hope so, since it supports
Unicode ... 
obviously this is not a nice solution or one for more simpler/embedded
systems.


(btw. the same bug occurs also with fts3 and also with no special
content option)
here a small example for normal fts4 with a more custom snippet call:

create virtual table test using fts4(german);
insert into test VALUES([1] a b c);

sqlite select *,snippet(test,#,#,...,0,64) from test where german
match 'a';
[1] a b c|1] #a# b c



regards boscowitch

PS: please excuse the german ;) and all English spelling errors 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-27 Thread supermariobros

Just to be clear. It basically means that after MATCH  records are returned
it iterates through ALL the rowids of the returned set and removes them from
the set and orders them accordingly.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/FTS-pagination-tp78754p78849.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-26 Thread Clemens Ladisch

supermariobros wrote:
 Well, they all give exactly the same output.

 sqlite EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE 
 activity_text_content MATCH 'x' ORDER BY rowid ASC LIMIT 100;
 0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4:ASC (~0 rows)

 sqlite EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE 
 activity_text_content MATCH 'x' AND rowid1000 ORDER BY rowid ASC LIMIT 10;
 0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4:ASC (~0 rows)

 sqlite EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE 
 activity_text_content MATCH 'x' LIMIT 100;
 0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4: (~0 rows)

 It almost looks like EXPLAIN ignores the second part where rowid is compared
 or sorted.

It doesn't change the way in which the database accesses the table, which
implies that the rowid comparison is not using any index, i.e., the FTS
module first computes the results, and then the rows with small rowid
values are thrown away.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-25 Thread supermariobros

Well, they all give exactly the same output.

sqlite EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE
activity_text_content MATCH 'x' ORDER BY rowid ASC LIMIT 100; 
0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4:ASC (~0 rows)

sqlite EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE
activity_text_content MATCH 'x' AND rowid1000 ORDER BY rowid ASC LIMIT 10; 
0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4:ASC (~0 rows)

sqlite EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE
activity_text_content MATCH 'x' LIMIT 100; 0|0|0|SCAN TABLE
activity_text_content VIRTUAL TABLE INDEX 4: (~0 rows)


It almost looks like EXPLAIN ignores the second part where rowid is compared
or sorted.
how should I understand that?




--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/FTS-pagination-tp78754p78831.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-24 Thread supermariobros

Hi 
Thanks For quick response

Of course you are right that I can not use row id in the way I used it
above. I guess I wrote it quicker than I thought about it.
However If I use original rowid and LIMIT it should be fine, knowing that
the submited rowid is the rowid of the last element of the previous set.
Like this:
SELECT rowid  FROM text_content WHERE  text_content MATCH 'x' AND rowid
1000 ORDER BY rowid LIMIT 10;
of course it only works when last rowid of the previous set is available so
I can not just skip to the N-th page, wchich is fine in my case.  I just
have to know first and last element to go back and forward;

This is what explain query shows:

EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE
activity_text_content MATCH 'x' AND rowid1000 ORDER BY rowid ASC LIMIT 10;
0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4: (~0 rows)  

If I understand it correctly it uses indexes properly on FTS but I do not
know how the row id scanning and sorting is done.
If you have some time could you shed some light on this.

Also when I compare time of the querry of the one with rowid  to the one
with OFFSET query on 5000 records I do not see any difference in execution
time. I read just like you wrote that OFFSET starts to be time consuming
when number is very high but what is the high number case. I doubt I will
ever deal with more than 1 records in this entire virtual table so
should I even be concerned?

Thanks






--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/FTS-pagination-tp78754p78772.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-24 Thread Clemens Ladisch

supermariobros wrote:
 EXPLAIN QUERY PLAN SELECT rowid  FROM activity_text_content WHERE
 activity_text_content MATCH 'x' AND rowid1000 ORDER BY rowid ASC LIMIT 10;
 0|0|0|SCAN TABLE activity_text_content VIRTUAL TABLE INDEX 4: (~0 rows)

 If I understand it correctly it uses indexes properly on FTS but I do not
 know how the row id scanning and sorting is done.

Compare the EXPLAIN QUERY PLAN output of this query and of the same
query without the rowid comparison.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-23 Thread supermariobros

Or maybe, if I am using android, it should be done at the cursor level?



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/FTS-pagination-tp78754p78755.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS pagination

2014-10-23 Thread supermariobros

Quick question. If I am using FTQ that looks like this SELECT * FROM mail
WHERE body MATCH 'sqlite'  can I add to it WHERE rowid  5 AND rwoid 10
or it will significantly slow it down.  If so what would be the best
approach for pagination, For example if I get 500 rows with the matching
term and obviously I do not need them all right away. Should I just use the
first option and maybe select only rowids and then go through them? 

Any suggestions?


Thanks



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/FTS-pagination-tp78754.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS pagination

2014-10-23 Thread Clemens Ladisch

supermariobros wrote:
 If I am using FTQ that looks like this
 SELECT * FROM mail WHERE body MATCH 'sqlite'
 can I add to it WHERE rowid  5 AND rwoid 10
 or it will significantly slow it down.

How much did it slow down when you tested it?

Anyway,
without index:
  sqlite EXPLAIN QUERY PLAN SELECT * FROM t;
  0|0|0|SCAN TABLE t VIRTUAL TABLE INDEX 0:
with word search:
  sqlite EXPLAIN QUERY PLAN SELECT * FROM t WHERE t MATCH 'x';
  0|0|0|SCAN TABLE t VIRTUAL TABLE INDEX 3:
your query:
  sqlite EXPLAIN QUERY PLAN SELECT * FROM t WHERE t MATCH 'x' AND rowid 
BETWEEN 5 AND 10;
  0|0|0|SCAN TABLE t VIRTUAL TABLE INDEX 393219:

So it uses some index (and the lowest two bits are still set, so
it still uses the FTS index, but this is an implementation detail.)

This might be different with a different SQLite version.


 If so what would be the best approach for pagination

The rowid cannot be used for pagination because you get the
numbers of the original rows.

You would have to use OFFSET/LIMIT, which is inefficient for
large offsets.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Clemens Ladisch

Milan Kříž wrote:
 3) A query which should use a linear scan according to the SQLite
documentation (http://www.sqlite.org/fts3.html#section_1_4)
 SELECT docId FROM ftsTable WHERE docId BETWEEN 20 AND 23
 - gets a following query plan:
 SCAN TABLE ftsTable VIRTUAL TABLE INDEX 393216:
 - the documentation does not say a word about another indexes on an
   FTS table, so where is the index 393216 come from?

This is an undocumented optimization.  In recent versions, FTS also
optimizes docid searches with less/greater than operators.

(The index number is an implementation detail.)

 4) The I have a query with both 'match ?' sub-clause and 'rowid=?'
sub-clause. It is not clear to me which variant will be used.
 But according to definition of Full-text query it should use full-text
 query at first. And then? Will it use index to rowid after full-text
 query is performed?
 SELECT docId FROM ftsTable WHERE ftsTable MATCH 'a*' AND rowId IN (20,21, 
 22, 23)
 - anyway from the query plan it seems that no full-text query is
   performed at all - or how to interpret it?:
 SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
 EXECUTE LIST SUBQUERY 1

INDEX 1 is the full-text search.  The rowid values of the returned
rows are then compared (by SQLite, outside of FTS) against the list.

 Could you please give me a clue how to guess whether a complex FTS
 query will use a full-text index and which one it will use?

There is only one full-text index per table.

The FTS module implements a search/lookup iff the EXPLAIN QUERY PLAN
output shows VIRTUAL TABLE INDEX.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Milan Kříž


Clemens Ladisch wrote:

4) The I have a query with both 'match ?' sub-clause and 'rowid=?'
sub-clause. It is not clear to me which variant will be used.
But according to definition of Full-text query it should use full-text
query at first. And then? Will it use index to rowid after full-text
query is performed?
 SELECT docId FROM ftsTable WHERE ftsTable MATCH 'a*' AND rowId IN (20,21, 
22, 23)
- anyway from the query plan it seems that no full-text query is
   performed at all - or how to interpret it?:
 SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
 EXECUTE LIST SUBQUERY 1

INDEX 1 is the full-text search.  The rowid values of the returned
rows are then compared (by SQLite, outside of FTS) against the list.


Ok, it would be what I expect. But according to my first two queries 1) and 2), it looks like a full-text index 
is the *index 18*.



1) A*full-text query*
SELECT docId FROM ftsTableWHERE ftsTable MATCH 'a*'
- gets a following query plan:
  SCAN TABLE ftsTable VIRTUAL TABLE*INDEX**18*:

2) A*query by rowid*
  SELECT docId FROM ftsTable WHERE docid = 10
- gets a following query plan:
  SCAN TABLE ftsTable VIRTUAL TABLE*INDEX 1*:


So maybe the query plan shows a wrong number? Or is the index number unrelated 
to a full-text index?
I thought that INDEX 1 is an index to rowid, but maybe index numbers are 
somehow fuzzy? :-).

Thanks,
Milan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Clemens Ladisch

Milan Kříž wrote:
 Clemens Ladisch wrote:
 INDEX 1 is the full-text search.

Sorry, that's wrong.

The idxNum value is determined as follows: (see fts3Int.h)

/*
** The Fts3Cursor.eSearch member is always set to one of the following.
** Actualy, Fts3Cursor.eSearch can be greater than or equal to
** FTS3_FULLTEXT_SEARCH.  If so, then Fts3Cursor.eSearch - 2 is the index
** of the column to be searched.  For example, in
**
** CREATE VIRTUAL TABLE ex1 USING fts3(a,b,c,d);
** SELECT docid FROM ex1 WHERE b MATCH 'one two three';
**
** Because the LHS of the MATCH operator is 2nd column b,
** Fts3Cursor.eSearch will be set to FTS3_FULLTEXT_SEARCH+1.  (+0 for a,
** +1 for b, +2 for c, +3 for d.)  If the LHS of MATCH were ex1
** indicating that all columns should be searched,
** then eSearch would be set to FTS3_FULLTEXT_SEARCH+4.
*/
#define FTS3_FULLSCAN_SEARCH 0/* Linear scan of %_content table */
#define FTS3_DOCID_SEARCH1/* Lookup by rowid on %_content table */
#define FTS3_FULLTEXT_SEARCH 2/* Full-text index search */

/*
** The lower 16-bits of the sqlite3_index_info.idxNum value set by
** the xBestIndex() method contains the Fts3Cursor.eSearch value described
** above. The upper 16-bits contain a combination of the following
** bits, used to describe extra constraints on full-text searches.
*/
#define FTS3_HAVE_LANGID0x0001  /* languageid=? */
#define FTS3_HAVE_DOCID_GE  0x0002  /* docid=? */
#define FTS3_HAVE_DOCID_LE  0x0004  /* docid=? */


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Milan Kříž


Clemens Ladisch wrote:

Milan Kříž wrote:

Clemens Ladisch wrote:

INDEX 1 is the full-text search.

Sorry, that's wrong.


So does it mean that the full-text search is not performed for the following 
query at all?
And that only the docId index is used to get entries in the IN sub-clause
and then a linear scan with a comparison to 'a*' is done?

query:
SELECT docId FROM ftsTable WHERE ftsTable MATCH 'a*' AND rowId IN (20,21, 22, 
23)
query plan:
SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
EXECUTE LIST SUBQUERY 1

Is it possible to force SQLite to use the full-text search instead of the rowid 
search?
I can think only about something like this:
select docId from (
select docId from ftsTable where ftsTable match 'a*'
) where docId in (21, 22, 23, 24)
query plan:
SCAN TABLE nameFtsTable VIRTUAL TABLE INDEX 18:
EXECUTE LIST SUBQUERY 1
It looks much better.  But what does the 'EXECUTE LIST SUBQUERY 1' mean? You wrote that returned values are 
compared by SQLite (outside of FTS).
But does it use some index (rowid index) or is it impossible for SQLite to use an index on the same table (even 
if the first one - full-text index - was used in a subquery)?


Of course, in my real use-case I have much more complex docId condition and I have a lot of entries in an FTS 
table (about million entries),
so I would like the full-text search to prune the results first and then filter results using docId. FTS 
condition is also much more complex than just 'a*' so I expect that

many results will be filtered out by the full-text query.

Thanks,
Milan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Clemens Ladisch

Milan Kříž wrote:
 So does it mean that the full-text search is not performed for the following 
 query at all?

 SELECT docId FROM ftsTable WHERE ftsTable MATCH 'a*' AND rowId IN (20,21, 22, 
 23)

 SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
 EXECUTE LIST SUBQUERY 1

No, it means that you are using a different version.

(In any case, it is not possible to execute MATCH without the FTS index.)

 what does the 'EXECUTE LIST SUBQUERY 1' mean?

It's how the rowid IN (...) is implemented.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Milan Kříž


Dne 28.8.2014 14:54, Clemens Ladisch napsal(a):

Milan Kříž wrote:

So does it mean that the full-text search is not performed for the following 
query at all?

SELECT docId FROM ftsTable WHERE ftsTable MATCH 'a*' AND rowId IN (20,21, 22, 
23)

 SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
 EXECUTE LIST SUBQUERY 1

No, it means that you are using a different version.

(In any case, it is not possible to execute MATCH without the FTS index.)


Thanks and sorry for bothering you with such details. But I still cannot 
understand that query plan.
Since for a simple rowid query it says:
explain query plan
select * from ftsTable where docId = 100
 SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
It is clear to me that index 1 is definitely a rowId index.

Then for complex query above, it says:

SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
EXECUTE LIST SUBQUERY 1

So where is a mention about using an FTS index (full-text query)?
Or is it just an imperfection of 'explain query plan' that the usage of the full-text index is missing from the 
query plan?


I use SQLite version 3.8.5 and for full table full-text search it gives a correct index according to comment in 
the fts3Int.h:


FTS3_FULLTEXT_SEARCH+${NUMBER_OF_COLUMNS} = VIRTUAL TABLE INDEX 18 in my 
ftsTable

Regards,
Milan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Clemens Ladisch

Milan Kříž wrote:
 Dne 28.8.2014 14:54, Clemens Ladisch napsal(a):
 Milan Kříž wrote:
 So does it mean that the full-text search is not performed for the 
 following query at all?

 No, it means that you are using a different version.

 But I still cannot understand that query plan.

Then try with 3.8.6.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS full-text query vs. query by rowid

2014-08-28 Thread Milan Kříž



Dne 28.8.2014 16:48, Clemens Ladisch napsal(a):

Then try with 3.8.6.


Ouuu . .sorry again.
I have tested it with 3.8.6 and the query plan looks ok now.
SCAN TABLE nameftsTable VIRTUAL TABLE INDEX 18:
EXECUTE LIST SUBQUERY 1

But I also tested it with my version again and I'm getting the same (correct) 
query plan now.
I cannot understand how it is possible :-).
The only thing I changed is that in the meantime I've recreated my ftsTable.
So it was either in some strange state I cannot reproduce now or I am simply 
overworked :-).
Or maybe I accidentally used '=' or 'is' instead of 'match'.

However, many thanks for your support, now it all seems much more clear.

Milan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS full-text query vs. query by rowid

2014-08-27 Thread Milan Kříž


Hello,
I would like to ask several questions regarding to SQLite FTS module and how it 
uses indexes.

I have following queries:

1) A full-text query
SELECT docId FROM ftsTableWHERE ftsTable MATCH 'a*'
- gets a following query plan:
SCAN TABLE ftsTable VIRTUAL TABLE INDEX 18:

2) A query by rowid
SELECT docId FROM ftsTable WHERE docid = 10
- gets a following query plan:
SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:

So far this is as expected.

3) A query which should use a linear scan according to the SQLite documentation 
(http://www.sqlite.org/fts3.html#section_1_4)

SELECT docId FROM ftsTable WHERE docId BETWEEN 20 AND 23
- gets a following query plan:
SCAN TABLE ftsTable VIRTUAL TABLE INDEX 393216:
- the documentation does not say a word about another indexes on an FTS table, so where is the index 393216 come 
from?


4) The I have a query with both 'match ?' sub-clause and 'rowid=?' sub-clause. It is not clear to me which 
variant will be used.
But according to definition of Full-text query it should use full-text query at first. And then? Will it use 
index to rowid after full-text query is performed?

SELECT docId FROM ftsTable WHERE ftsTable MATCH 'a*' AND rowId IN (20,21, 
22, 23)
- anyway from the query plan it seems that no full-text query is performed at 
all - or how to interpret it?:
SCAN TABLE ftsTable VIRTUAL TABLE INDEX 1:
EXECUTE LIST SUBQUERY 1

5) When between is used instead of in, SQLite should use a linear scan, but 
query plan says something else:
select docId from ftsTable where ftsTable match 'a*' and docId between 20 
and 23
- query plan
SCAN TABLE ftsTable VIRTUAL TABLE INDEX 393234:
- even another index (393234) seems to be used but again, where does it come 
from?

Is there any undocumented optimizations in the FTS module or did I miss some 
note in the SQLite documentation?
Could you please give me a clue how to guess whether a complex FTS query will use a full-text index and which 
one it will use?

And could you explain what the different numbers of indexes means? Are all 
indexes really created in a DB?

Thanks for you explanation,
Milan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS module and DB close

2014-06-08 Thread gwenn

Ok,
Maybe the solution is:
1) try to close the connection: sqlite3_close
2) if error code is SQLITE_BUSY,
 a) use sqlite3_next_stmt to finalize dangling statements
 b) retry to close the connection
Step (1) ensures that FTS related statements are finalized.
Thanks.

On Sat, Jun 7, 2014 at 7:49 PM, gwenn gwenn.k...@gmail.com wrote:
 Hello,
 How do you prevent double free/finalize of statements created by the
 FTS module ?
 I am using sqlite3_next_stmt to finalize all dangling statements
 before closing the connection but the program crashes because the FTS
 module finalizes them too when sqlite3_close is called...
 May be I should use sqlite3_close_v2 (not available on MacOS X: SQLite
 version 3.7.13) ?

 Thanks
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS module and DB close

2014-06-07 Thread gwenn

Hello,
How do you prevent double free/finalize of statements created by the
FTS module ?
I am using sqlite3_next_stmt to finalize all dangling statements
before closing the connection but the program crashes because the FTS
module finalizes them too when sqlite3_close is called...
May be I should use sqlite3_close_v2 (not available on MacOS X: SQLite
version 3.7.13) ?

Thanks
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS performance (mobile devices)

2013-06-20 Thread Jan Slodicka

 be used for small
changes only.
In case of bulk actions, the triggers should be dropped and the FTS index
rebuilt after the action completes.

2. Rebuilding FTS index takes a lot of memory, whereby the amount does not
depend on the table size.
   Question: Would it be possible to reduce this number? Apparently FTS
index can be built with less memory.

3. Don't use FTS index optimization. Small gains in terms of speed, risk of
high memory consumption.

4. There is some problem in SQLite FTS code as the described crash proves.
(Might relate to empty index only)

5. DB size: This is not described above, but it is important to realize,
that SQLite DB does not shrink.
   FTS index takes compareble space to the data being indexed and it will be
stored first in the WAL log, then in the DB itself.
   In the worst case the DB size may blow up 2-3x.
   Solution for the WAL log: PRAGMA journal_size_limit=10485760 (fast and
easy)
   Solution for the DB: VACUUM (slow, that's way difficult)

Any questions/comments are welcome.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Find Tokens at Record Start

2013-04-05 Thread Clemens Ladisch

Paul Vercellotti wrote:
 using FTS, how do you match records that contain certain tokens beginning at 
 the start of the record

Apparently, this information is not stored in the FTS index.

Search for the tokens, then manually check with LIKE or something like that.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Find Tokens at Record Start

2013-04-05 Thread Fabian Klebert

You might want to check the following:

SELECT word FROM fts WHERE fts MATCH '^token'

Beginning with 3.7.9 this should only return records that have 'token' at the 
beginning of the record.
See changelog of 3.7.9:

If a search token (on the right-hand side of the MATCH operator) in FTS4 begins 
with ^ then that token must be the first in its field of the document. ** 
Potentially Incompatible Change **



-Ursprüngliche Nachricht-
Von: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
Im Auftrag von Clemens Ladisch
Gesendet: Freitag, 5. April 2013 15:41
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] FTS Find Tokens at Record Start

Paul Vercellotti wrote:
 using FTS, how do you match records that contain certain tokens beginning at 
 the start of the record

Apparently, this information is not stored in the FTS index.

Search for the tokens, then manually check with LIKE or something like that.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS Find Tokens at Record Start

2013-04-04 Thread Paul Vercellotti



Hi there,

I couldn't find this from the documentation: using FTS, how do you match 
records that contain certain tokens beginning at the start of the record (or 
any token position for that matter).

For example, I want to match records that start with Four score and seven 
years ago but not match records that contain that phrase in the middle.

This matches any document that contains the phrase:
SELECT rowid FROM documents WHERE content MATCH 'Four score and seven years 
ago';
But I want only the results that start with that phrase, which would be a 
subset of those results. 

It looks like I could programmatically parse the output of the offsetsfunction 
to find this info and manually filter my results, but is there a way to set up 
the query so it does the filtering for me, and only returns results that start 
at byte offset 0 in the column (or token 0)?


Thanks!
-Paul
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS problem with 'NOT term' queries

2013-02-03 Thread Alexey Pechnikov

Hello!


 And as result it's impossible to search docs in some situations:
 SELECT * FROM docs WHERE docs MATCH 'NOT sqlite';
 Error: malformed MATCH expression: [NOT sqlite]


 As far as I can tell, in MATCH syntax NOT is a binary operator, denoting
 set difference. You are trying to use it as a unary operator.


Well, and how to rewrite query MATCH 'NOT sqlite'? In simple FTS qery
syntax is possible to use -sqlite but is the equal construction by
extended syntax?


Best  regards,
Alexey.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS problem with 'NOT term' queries

2013-02-03 Thread Dan Kennedy


On 02/04/2013 12:18 AM, Alexey Pechnikov wrote:

Hello!




And as result it's impossible to search docs in some situations:
SELECT * FROM docs WHERE docs MATCH 'NOT sqlite';
Error: malformed MATCH expression: [NOT sqlite]



As far as I can tell, in MATCH syntax NOT is a binary operator, denoting
set difference. You are trying to use it as a unary operator.



Well, and how to rewrite query MATCH 'NOT sqlite'? In simple FTS qery
syntax is possible to use -sqlite but is the equal construction by
extended syntax?


Not possible. And note that the query -sqlite doesn't actually work
either. Always returns an empty set.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS SQLite optimize command creates very large BLOBs in *_segment table.

2013-01-31 Thread Dominique Pellé

Hi

I have a database using SQLite-3.7.14 with a FTS4 virtual table (Free
Text Search). The FTS table contains several millions of small documents.

The FTS DB is created on a server (where creating time does not matter)
and then used on an embedded device as a read-only database for FTS
queries (where speed of queries need to be as fast as possible and use
as little memory as possible).

Since the DB is used read-only on the device, I ran the the optimize
command (see http://www.sqlite.org/fts3.html#optimize) in the FTS index
after the last INSERT was done on the server.

Unfortunately, I found that the optimize command is creating very large
BLOBs in the database since it merges together all of inverted index b-trees
into very large complete b-tree. For frequent terms, merged BLOBs can be
up to ~50MB in my case in the ftsTable_segment table, as a result of
running optimize:

  sqlite SELECT *, length(block) AS len_blob
FROM ftsTable_segments
ORDER BY len_blob DESC LIMIT 3;

  seblockid|block|len_blob
  336808||51867353 -- 51 MB BLOB!
  311724||19375541 -- 19 MB BLOB
  334719||19223423 -- 19 MB BLOB

Such large BLOBs (~50MB) are a problem in my case as they consume
a  large amount of memory on a embedded device, when doing FTS
queries with several frequent terms. SQLite memory high watermark
reported by sqlite3_memory_highwater() reaches ~200MB when query
contains several frequent terms, which is too much for an embedded
device, even though I set of soft memory limit of only 3MB using
sqlite3_soft_heap_limit64(...).

As a result, I have disabled running optimize on the FTS index
after creating the FTS DB on the server.

However, it got me thinking: it would be nice to be able to pass an
optional parameter to the FTS optimize command in order to
avoid merging b-trees when BLOBs reach a certain size?  In other
words, instead of doing...

INSERT INTO ftsTable(ftsTable) VALUES('optimize');

... I would like to be able to do something like this...

INSERT INTO ftsTable(ftsTable) VALUES('optimize=1048576');

... where optimize=1048576 indicates to *partially* optimize
reverse index b-trees in such a way that BLOBs do not exceed
1MB (1048577 bytes) in this example. It's OK if it's a fuzzy soft limit.


1/ Wouldn't such partial optimization of FTS index be useful?

2/ I also suggest that the documentation at
http://www.sqlite.org/fts3.html#optimize
indicates that optimizing an FTS index can create very large BLOBs.
This may be OK on a desktop or server, but it can be a problem
on embedded devices with limited amount of memory.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS problem with 'NOT term' queries

2013-01-29 Thread Alexey Pechnikov

Hello!

From
http://www.sqlite.org/fts3.html#section_3_1
we can see the query

SELECT * FROM docs WHERE docs MATCH 'database NOT sqlite';

But the equal query doesn't works:

SELECT * FROM docs WHERE docs MATCH 'NOT sqlite AND database';
Error: malformed MATCH expression: [NOT sqlite AND database]

And as result it's impossible to search docs in some situations:
SELECT * FROM docs WHERE docs MATCH 'NOT sqlite';
Error: malformed MATCH expression: [NOT sqlite]



-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS problem with 'NOT term' queries

2013-01-29 Thread Igor Tandetnik


On 1/29/2013 11:30 PM, Alexey Pechnikov wrote:

From
http://www.sqlite.org/fts3.html#section_3_1
we can see the query

SELECT * FROM docs WHERE docs MATCH 'database NOT sqlite';

But the equal query doesn't works:

SELECT * FROM docs WHERE docs MATCH 'NOT sqlite AND database';
Error: malformed MATCH expression: [NOT sqlite AND database]

And as result it's impossible to search docs in some situations:
SELECT * FROM docs WHERE docs MATCH 'NOT sqlite';
Error: malformed MATCH expression: [NOT sqlite]


As far as I can tell, in MATCH syntax NOT is a binary operator, denoting 
set difference. You are trying to use it as a unary operator.

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS problem with 'NOT term' queries

2013-01-29 Thread Kevin Benson

The explanation right above that table of examples contains these important
(I believe) phrases:

...BINARY SET operators...
...TWO operands to an operator...

(emphasis mine)
On Tue, Jan 29, 2013 at 11:30 PM, Alexey Pechnikov
pechni...@mobigroup.ruwrote:

 SELECT * FROM docs WHERE docs MATCH 'NOT




--
   --
  --
 --Ô¿Ô--
K e V i N
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS questions

2013-01-12 Thread moti lahiani

Hello all

I'm new with sqlite3 and sql.

I have data base that include path columns (file system path like c:\bla
bla\myFiles\1.txt)

On that columns I need to do

1)  search for patterns in case the user want to find a file or
directory

2)  search for prefix path in case the user rename a file or directory

the Database include about 1.5-2.5 million records and to use the LIKE is
not possible because the result time.

As an alternative I want to use FTS3 or FTS4 but I think I have a problems
with what I read here: http://www.sqlite.org/fts3.html#section_1_4

And here: http://www.sqlite.org/fts3.html#section_6_3

I need to specify the language to FTS to use it as tokenize but the path
can include multi languages how can I configure the FTS table to use all
languages

How can I tell to FTS to token the path only according to the character \
?



More than that when creating FTS table it creates with the TEMPORARY  key
word. My question is: do I need to create this table each time I run the
data base   (because the temporary word) or for each connections (in case
of multiconnections) or this is a table like all tables I declared and it
stay in the data base even if I restart my PC



Thanks a lot
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS questions

2013-01-12 Thread moti lahiani

Hello all

I'm new with sqlite3 and sql.

I have data base that include path columns (file system path like c:\bla
bla\myFiles\1.txt)

On that columns I need to do

1)  search for patterns in case the user want to find a file or
directory

2)  search for prefix path in case the user rename a file or directory

the Database include about 1.5-2.5 million records and to use the LIKE is
not possible because the result time.

As an alternative I want to use FTS3 or FTS4 but I think I have a problems
with what I read here: http://www.sqlite.org/fts3.html#section_1_4

And here: http://www.sqlite.org/fts3.html#section_6_3

I need to specify the language to FTS to use it as tokenize but the path
can include multi languages how can I configure the FTS table to use all
languages

How can I tell to FTS to token the path only according to the character \
?



More than that when creating FTS table it creates with the TEMPORARY  key
word. My question is: do I need to create this table each time I run the
data base   (because the temporary word) or for each connections (in case
of multiconnections) or this is a table like all tables I declared and it
stay in the data base even if I restart my PC



Thanks a lot
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS questions

2013-01-12 Thread Michael Black

I'm not sure I understand your problem.
Why do you care what language it is?  Aren't you just wanting to tokenize on
backslash?

Simple way is to replace all spaces in the path with another char (e.g. '_')
then replace all backslashes with a space.
Then you can just use the default tokenizer and make the same changes on any
user queries.  So you map the user query to your storage format.

Not sure why you would want to use some stem tokenizer on paths.

And, of course, you're reinventing the wheel unless you have some special
purpose in mind.

http://locate32.cogit.net/

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of moti lahiani
Sent: Saturday, January 12, 2013 4:37 AM
To: sqlite-users@sqlite.org
Cc: Moti LAHIANI
Subject: [sqlite] FTS questions

Hello all

I'm new with sqlite3 and sql.

I have data base that include path columns (file system path like c:\bla
bla\myFiles\1.txt)

On that columns I need to do

1)  search for patterns in case the user want to find a file or
directory

2)  search for prefix path in case the user rename a file or directory

the Database include about 1.5-2.5 million records and to use the LIKE is
not possible because the result time.

As an alternative I want to use FTS3 or FTS4 but I think I have a problems
with what I read here: http://www.sqlite.org/fts3.html#section_1_4

And here: http://www.sqlite.org/fts3.html#section_6_3

I need to specify the language to FTS to use it as tokenize but the path
can include multi languages how can I configure the FTS table to use all
languages

How can I tell to FTS to token the path only according to the character \
?



More than that when creating FTS table it creates with the TEMPORARY  key
word. My question is: do I need to create this table each time I run the
data base   (because the temporary word) or for each connections (in case
of multiconnections) or this is a table like all tables I declared and it
stay in the data base even if I restart my PC



Thanks a lot
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS questions

2013-01-12 Thread moti lahiani

Thanks for your reply
Why I care the language: according to the documentation:
A term is a contiguous sequence of eligible characters, where eligible
characters are all alphanumeric characters and all characters with Unicode
codepoint values greater than or equal to 128. All other characters are
discarded when splitting a document into terms. Their only contribution is
to separate adjacent terms.
if the path include folder or file name in France or Hebrew or Arabic  and
the user what to search that word according to above the FTS will not found
it. Am I wrong?

about the backslash/slash if I have a path like this and the user search
for עברית
did the FTS will find it?
c:\folders\video\עברית

about the creation I was confuse it not temporary its virtual so ignore my
question
Thanks




On Sat, Jan 12, 2013 at 5:58 PM, Michael Black mdblac...@yahoo.com wrote:

 I'm not sure I understand your problem.
 Why do you care what language it is?  Aren't you just wanting to tokenize
 on
 backslash?

 Simple way is to replace all spaces in the path with another char (e.g.
 '_')
 then replace all backslashes with a space.
 Then you can just use the default tokenizer and make the same changes on
 any
 user queries.  So you map the user query to your storage format.

 Not sure why you would want to use some stem tokenizer on paths.

 And, of course, you're reinventing the wheel unless you have some special
 purpose in mind.

 http://locate32.cogit.net/

 -Original Message-
 From: sqlite-users-boun...@sqlite.org
 [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of moti lahiani
 Sent: Saturday, January 12, 2013 4:37 AM
 To: sqlite-users@sqlite.org
 Cc: Moti LAHIANI
 Subject: [sqlite] FTS questions

 Hello all

 I'm new with sqlite3 and sql.

 I have data base that include path columns (file system path like c:\bla
 bla\myFiles\1.txt)

 On that columns I need to do

 1)  search for patterns in case the user want to find a file or
 directory

 2)  search for prefix path in case the user rename a file or directory

 the Database include about 1.5-2.5 million records and to use the LIKE is
 not possible because the result time.

 As an alternative I want to use FTS3 or FTS4 but I think I have a problems
 with what I read here: http://www.sqlite.org/fts3.html#section_1_4

 And here: http://www.sqlite.org/fts3.html#section_6_3

 I need to specify the language to FTS to use it as tokenize but the path
 can include multi languages how can I configure the FTS table to use all
 languages

 How can I tell to FTS to token the path only according to the character \
 ?



 More than that when creating FTS table it creates with the TEMPORARY  key
 word. My question is: do I need to create this table each time I run the
 data base   (because the temporary word) or for each connections (in case
 of multiconnections) or this is a table like all tables I declared and it
 stay in the data base even if I restart my PC



 Thanks a lot
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS questions

2013-01-12 Thread Michael Black

Test it yourself:

create virtual test using fts4(context text);
insert into test values ('c:\folders\video\עברית');
select * from test where context match 'עברית';
If you want a partial match add a wildcard
select * from test where context match 'עברית*';

I don't have the codepage running so I can't test it but it sure works for 
English.  Don't see why it wouldn't work for other languages.



-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of moti lahiani
Sent: Saturday, January 12, 2013 10:56 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] FTS questions

Thanks for your reply
Why I care the language: according to the documentation:
A term is a contiguous sequence of eligible characters, where eligible
characters are all alphanumeric characters and all characters with Unicode
codepoint values greater than or equal to 128. All other characters are
discarded when splitting a document into terms. Their only contribution is
to separate adjacent terms.
if the path include folder or file name in France or Hebrew or Arabic  and
the user what to search that word according to above the FTS will not found
it. Am I wrong?

about the backslash/slash if I have a path like this and the user search
for עברית
did the FTS will find it?
c:\folders\video\עברית

about the creation I was confuse it not temporary its virtual so ignore my
question
Thanks




On Sat, Jan 12, 2013 at 5:58 PM, Michael Black mdblac...@yahoo.com wrote:

 I'm not sure I understand your problem.
 Why do you care what language it is?  Aren't you just wanting to tokenize
 on
 backslash?

 Simple way is to replace all spaces in the path with another char (e.g.
 '_')
 then replace all backslashes with a space.
 Then you can just use the default tokenizer and make the same changes on
 any
 user queries.  So you map the user query to your storage format.

 Not sure why you would want to use some stem tokenizer on paths.

 And, of course, you're reinventing the wheel unless you have some special
 purpose in mind.

 http://locate32.cogit.net/

 -Original Message-
 From: sqlite-users-boun...@sqlite.org
 [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of moti lahiani
 Sent: Saturday, January 12, 2013 4:37 AM
 To: sqlite-users@sqlite.org
 Cc: Moti LAHIANI
 Subject: [sqlite] FTS questions

 Hello all

 I'm new with sqlite3 and sql.

 I have data base that include path columns (file system path like c:\bla
 bla\myFiles\1.txt)

 On that columns I need to do

 1)  search for patterns in case the user want to find a file or
 directory

 2)  search for prefix path in case the user rename a file or directory

 the Database include about 1.5-2.5 million records and to use the LIKE is
 not possible because the result time.

 As an alternative I want to use FTS3 or FTS4 but I think I have a problems
 with what I read here: http://www.sqlite.org/fts3.html#section_1_4

 And here: http://www.sqlite.org/fts3.html#section_6_3

 I need to specify the language to FTS to use it as tokenize but the path
 can include multi languages how can I configure the FTS table to use all
 languages

 How can I tell to FTS to token the path only according to the character \
 ?



 More than that when creating FTS table it creates with the TEMPORARY  key
 word. My question is: do I need to create this table each time I run the
 data base   (because the temporary word) or for each connections (in case
 of multiconnections) or this is a table like all tables I declared and it
 stay in the data base even if I restart my PC



 Thanks a lot
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS substring behavior

2012-11-08 Thread Paul Vercellotti



Hi there,

I wanted to clarify if FTS could provide any optimization for substring matches 
like '*ion*' or similar?

That is, does it only scan the token index for matching tokens to locate the 
main table records that contain those tokens, or does it do a full table scan 
of the main table?

The number of unique tokens we have is small compared to the total number of 
records, so if it only scanned the token index it would in theory help.

Thanks
Paul
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS substring behavior

2012-11-08 Thread Dan Kennedy


On 11/09/2012 01:49 AM, Paul Vercellotti wrote:



Hi there,

I wanted to clarify if FTS could provide any optimization for substring matches 
like '*ion*' or similar?


No. I think it will actually search for tokens that start with the 4
ASCII characters *ion if you try that.

Dan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS substring behavior

2012-11-08 Thread Yongil Jang

How about look at following URL?

https://github.com/jonasfj/trilite

AFAIK, FTS doesn't support substring search.
I also tried to edit FTS to find substring by changing simple tokenizer.
It was worked partially, but not a good solution to use generally.


2012/11/9 Dan Kennedy danielk1...@gmail.com

 On 11/09/2012 01:49 AM, Paul Vercellotti wrote:



 Hi there,

 I wanted to clarify if FTS could provide any optimization for substring
 matches like '*ion*' or similar?


 No. I think it will actually search for tokens that start with the 4
 ASCII characters *ion if you try that.

 Dan.

 __**_
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**usershttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS substring behavior

2012-11-08 Thread Paul Vercellotti



That's a promising project; I hope it reaches maturity.

I assume your modified tokenizer did a similar thing, like tokenizing full 
text search as [full, text, search, ull, ll, l, ext, xt, xt, earch, arch, rch, 
ch, h]?

What worked and what did not work?

Thanks,
Paul



 From: Yongil Jang yongilj...@gmail.com
To: General Discussion of SQLite Database sqlite-users@sqlite.org 
Sent: Thursday, November 8, 2012 2:26 PM
Subject: Re: [sqlite] FTS substring behavior
 
How about look at following URL?

https://github.com/jonasfj/trilite

AFAIK, FTS doesn't support substring search.
I also tried to edit FTS to find substring by changing simple tokenizer.
It was worked partially, but not a good solution to use generally.


2012/11/9 Dan Kennedy danielk1...@gmail.com

 On 11/09/2012 01:49 AM, Paul Vercellotti wrote:



 Hi there,

 I wanted to clarify if FTS could provide any optimization for substring
 matches like '*ion*' or similar?


 No. I think it will actually search for tokens that start with the 4
 ASCII characters *ion if you try that.

 Dan.

 __**_
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**usershttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS substring behavior

2012-11-08 Thread Yongil Jang

A little bit different.
That sentence of full text search is split as [ful, tex, sea, ull, ll, l,
ext, xt, xt, ear, arc, rch, ch, h] with 3 maximum length of key.
Of course, search pattern string length is limited to 3 and query is
changed as follows.

[Example: search 'earch']
SELECT ... FROM [FTS table] WHERE [search fields or table] MATCH 'ear' AND
([search field1] like '%earch%' OR [search field2] like '%earch%' OR ... OR
[search fieldn] like '%earch%')

The reason of ...
 1) 3 characters: To minimize FTS index size and to improve search
performance
 2) Adding like clauses: To refine result set
 2) many likes: To search all columns in FTS table when a table name is
used in MATCH clause.

BR,
Yongil Jang.



2012/11/9 Paul Vercellotti pverce...@yahoo.com



 That's a promising project; I hope it reaches maturity.

 I assume your modified tokenizer did a similar thing, like tokenizing
 full text search as [full, text, search, ull, ll, l, ext, xt, xt, earch,
 arch, rch, ch, h]?

 What worked and what did not work?

 Thanks,
 Paul


 
  From: Yongil Jang yongilj...@gmail.com
 To: General Discussion of SQLite Database sqlite-users@sqlite.org
 Sent: Thursday, November 8, 2012 2:26 PM
 Subject: Re: [sqlite] FTS substring behavior

 How about look at following URL?

 https://github.com/jonasfj/trilite

 AFAIK, FTS doesn't support substring search.
 I also tried to edit FTS to find substring by changing simple tokenizer.
 It was worked partially, but not a good solution to use generally.


 2012/11/9 Dan Kennedy danielk1...@gmail.com

  On 11/09/2012 01:49 AM, Paul Vercellotti wrote:
 
 
 
  Hi there,
 
  I wanted to clarify if FTS could provide any optimization for substring
  matches like '*ion*' or similar?
 
 
  No. I think it will actually search for tokens that start with the 4
  ASCII characters *ion if you try that.
 
  Dan.
 
  __**_
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**users
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS returns out of memory when use NEAR and OR

2012-10-24 Thread Vlad Seryakov

Hello

For some time already i noticed that when i use NEAR/1 and OR in one query like
SELECT * FROM search WHERE search MATCH 'tom NEAR/1 hanks or tom hanks'

i get out of memory error. Running this on 16Gb laptop cannot be memory issue 
and the database only has several thousands of records. Investigating the code 
i found one place where in fts3EvalNearTest where it happens:

line 129689 i version 3.7.14.1

nTmp += p-pPhrase-doclist.nList; 
aTmp = sqlite3_malloc(nTmp*2);
if( !aTmp ){ 

Adding the check before the malloc solved the problem
if (nTmp = 0) return res;

Thanks

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS returns out of memory when use NEAR and OR

2012-10-24 Thread Dan Kennedy


On 10/24/2012 11:07 PM, Vlad Seryakov wrote:

Hello

For some time already i noticed that when i use NEAR/1 and OR in one
query like SELECT * FROM search WHERE search MATCH 'tom NEAR/1 hanks
or tom hanks'


Are you able to share the database file that you use to reproduce
this? Thanks.

Dan.





i get out of memory error. Running this on 16Gb laptop cannot be
memory issue and the database only has several thousands of records.
Investigating the code i found one place where in fts3EvalNearTest
where it happens:

line 129689 i version 3.7.14.1

nTmp += p-pPhrase-doclist.nList; aTmp = sqlite3_malloc(nTmp*2); if(
!aTmp ){

Adding the check before the malloc solved the problem if (nTmp= 0)
return res;

Thanks

___ sqlite-users mailing
list sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS: Phrase queries

2012-07-13 Thread Fabian

Ever since I started using FTS extensively, I frequently ran into this
limitation:

  ** TODO: Strangely, it is not possible to associate a column specifier
  ** with a quoted phrase, only with a single token. Not sure if this was
  ** an implementation artifact or an intentional decision when fts3 was
  ** first implemented. Whichever it was, this module duplicates the
  ** limitation.

Is it ever planned to be fixed, because it doesn't seem to break any
backwards compatibilty?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS Example Fails (matchinfo arguments)

2012-06-14 Thread Sergei G

I am running sqlite3 version 3.7.3 on debian.

I run the following commands from fts3.html documentation page:

CREATE VIRTUAL TABLE t1 USING fts4(a, b);
INSERT INTO t1 VALUES('transaction default models default', 'Non
transaction reads');
INSERT INTO t1 VALUES('the default transaction', 'these semantics present');
INSERT INTO t1 VALUES('single request', 'default data');

SELECT matchinfo(t1) FROM t1 WHERE t1 MATCH 'default transaction
these semantics';

SELECT matchinfo(t1, 'ns') FROM t1 WHERE t1 MATCH 'default transaction';

The last line with match info(t1, 'ns') fails with the following error
message:

Error: wrong number of arguments to function matchinfo()

I have originally worked on my own table, but found the same error. So, I
have tried example above and it failed for me in the same way.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Example Fails (matchinfo arguments)

2012-06-14 Thread Dan Kennedy


On 06/14/2012 01:27 PM, Sergei G wrote:

I am running sqlite3 version 3.7.3 on debian.

I run the following commands from fts3.html documentation page:

CREATE VIRTUAL TABLE t1 USING fts4(a, b);
INSERT INTO t1 VALUES('transaction default models default', 'Non
transaction reads');
INSERT INTO t1 VALUES('the default transaction', 'these semantics present');
INSERT INTO t1 VALUES('single request', 'default data');

SELECT matchinfo(t1) FROM t1 WHERE t1 MATCH 'default transaction
these semantics';

SELECT matchinfo(t1, 'ns') FROM t1 WHERE t1 MATCH 'default transaction';

The last line with match info(t1, 'ns') fails with the following error
message:

Error: wrong number of arguments to function matchinfo()

I have originally worked on my own table, but found the same error. So, I
have tried example above and it failed for me in the same way.


I think the two argument version of matchinfo() is only in 3.7.4 and
newer. Prior to that it only accepted one argument.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Example Fails (matchinfo arguments)

2012-06-14 Thread Sergei G

Is there a way I can obtain documentation that matches my version?
Online documentation is for the most current version.
I have found that both my hosting provider and debian stable are a bit
behind, so I have to work with what I've got.


Thanks

On Thu, Jun 14, 2012 at 1:16 AM, Dan Kennedy danielk1...@gmail.com wrote:

 On 06/14/2012 01:27 PM, Sergei G wrote:

 I am running sqlite3 version 3.7.3 on debian.

 I run the following commands from fts3.html documentation page:

 CREATE VIRTUAL TABLE t1 USING fts4(a, b);
 INSERT INTO t1 VALUES('transaction default models default', 'Non
 transaction reads');
 INSERT INTO t1 VALUES('the default transaction', 'these semantics
 present');
 INSERT INTO t1 VALUES('single request', 'default data');

 SELECT matchinfo(t1) FROM t1 WHERE t1 MATCH 'default transaction
 these semantics';

 SELECT matchinfo(t1, 'ns') FROM t1 WHERE t1 MATCH 'default transaction';

 The last line with match info(t1, 'ns') fails with the following error
 message:

 Error: wrong number of arguments to function matchinfo()

 I have originally worked on my own table, but found the same error. So, I
 have tried example above and it failed for me in the same way.


 I think the two argument version of matchinfo() is only in 3.7.4 and
 newer. Prior to that it only accepted one argument.
 __**_
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**usershttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Example Fails (matchinfo arguments)

2012-06-14 Thread Kees Nuyt

On Thu, 14 Jun 2012 13:13:58 -0700, Sergei G sergeig.pub...@gmail.com
wrote:

Is there a way I can obtain documentation that matches my version?
Online documentation is for the most current version.
I have found that both my hosting provider and debian stable are a bit
behind, so I have to work with what I've got.


Thanks

Part of the documentation is partly generated from the sqlite source
code maintained in the fossil [1] source repository.
The remainder is kept in a separate source repository.


You can check out any point on the timelines of both repositories.

Links are at the bottom of :
http://www.sqlite.org/download.html 


[1] http://www.fossil-scm.org/

-- 
Regards,

Kees Nuyt

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Example Fails (matchinfo arguments)

2012-06-14 Thread Kevin Benson

On Thu, Jun 14, 2012 at 4:27 PM, Kees Nuyt k.n...@zonnet.nl wrote:

 On Thu, 14 Jun 2012 13:13:58 -0700, Sergei G sergeig.pub...@gmail.com
 wrote:

 Is there a way I can obtain documentation that matches my version?
 Online documentation is for the most current version.
 I have found that both my hosting provider and debian stable are a bit
 behind, so I have to work with what I've got.
 
 
 Thanks

 Part of the documentation is partly generated from the sqlite source
 code maintained in the fossil [1] source repository.
 The remainder is kept in a separate source repository.


 You can check out any point on the timelines of both repositories.

 Links are at the bottom of :
 http://www.sqlite.org/download.html


 [1] http://www.fossil-scm.org/

 --
 Regards,

 Kees Nuyt



 And, of course, there's also the Internet Archive:

http://wayback.archive.org/web/*/http://www.sqlite.org

--
   --
  --
 --Ô¿Ô--
K e V i N
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Example Fails (matchinfo arguments)

2012-06-14 Thread Richard Hipp

On Thu, Jun 14, 2012 at 4:13 PM, Sergei G sergeig.pub...@gmail.com wrote:

 Is there a way I can obtain documentation that matches my version?
 Online documentation is for the most current version.
 I have found that both my hosting provider and debian stable are a bit
 behind, so I have to work with what I've got.



http://www.sqlite.org/sqlite_docs_3_7_3.zip

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS simple tokenizer with custom delimeters

2012-05-06 Thread Jos Groot Lipman

While looking around in the source of the simple tokenizer I found code that
suggests custom delimeters can be specified (I want to exclude the
underscore).
 
 
http://www.sqlite.org/src/artifact/5c98225a53705e5ee34824087478cf477bdb7004?
ln=76-87
 
An indeed:
  CREATE VIRTUAL TABLE ft USING fts3(title, body, tokenize=simple XX
['\ *()./\\=,:;%-?!])
seems to work fine.
 
As far as I can tell this feature is undocumented which means I am not
suppose to use it.
Is this:
- An oversight
- For good reason as it is unstable
- or: because the syntax might change in the near future?
 
Also: I need to include the dummy XX as the delimeters are searched in
argv[1] in stead of argv[0]. I cannot find what the argv[0] is supposed to
do here. Any reason?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sqlite FTS retrieve inverted index

2012-03-12 Thread Mario Annau

Hello,

unfortunately I have already posted this question on
stackoverflowhttp://stackoverflow.com/questions/9657016/get-inverted-index-from-sqlite-fts-table,
hope that this mailing list is right address.

After I have implemented a full text search function in my application
using Sqlite and FTS tables I would be interested
in a (performant) way of retrieving the FULL inverted index (or large part)
out of my FTS (sub-)table.

In effect - I would need a result table including the terms, docid's and
number of occurences.

I am actually searching for some basic code/examples to read the segdir /
segments table (where the actual index is stored ) and construct my
desired result table (in effect - the inverted index). But any solution
which could retrieve the full (or large part of) my inverted index using
queries including MATCH, MATCHINFO , etc. (sorry, I'm no Sqlite export)
would be highly appreciated!

Best,
mario
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Sqlite FTS retrieve inverted index

2012-03-12 Thread Alexey Pechnikov

See
http://www.sqlite.org/draft/fts3.html#fts4aux

2012/3/13 Mario Annau mario.an...@gmail.com:
 Hello,

 unfortunately I have already posted this question on
 stackoverflowhttp://stackoverflow.com/questions/9657016/get-inverted-index-from-sqlite-fts-table,
 hope that this mailing list is right address.

 After I have implemented a full text search function in my application
 using Sqlite and FTS tables I would be interested
 in a (performant) way of retrieving the FULL inverted index (or large part)
 out of my FTS (sub-)table.

 In effect - I would need a result table including the terms, docid's and
 number of occurences.

 I am actually searching for some basic code/examples to read the segdir /
 segments table (where the actual index is stored ) and construct my
 desired result table (in effect - the inverted index). But any solution
 which could retrieve the full (or large part of) my inverted index using
 queries including MATCH, MATCHINFO , etc. (sorry, I'm no Sqlite export)
 would be highly appreciated!

 Best,
 mario
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Sqlite FTS retrieve inverted index

2012-03-12 Thread Mario Annau

Hi Alexey,
tha

Am 12. März 2012 22:14 schrieb Alexey Pechnikov pechni...@mobigroup.ru:

 See
 http://www.sqlite.org/draft/fts3.html#fts4aux


have already tried the fts4aux table. however, I would also need
the number of occurrences of each term in each document.

Therefore, like in the docs, not only

*-- The following query returns this data:**--**-- apple   |
*  |  1  |  1**-- apple   |  0  |  1  |  1**-- banana
|  *  |  2  |  2**-- banana  |  0  |  2  |  2**-- cherry
   |  *  |  3  |  3**-- cherry  |  0  |  1  |  1**--
cherry  |  1  |  2  |  2**-- date|  *  |  1  |  2**--
   date|  0  |  1  |  2**-- elderberry  |  *  |  1  |
2**-- elderberry  |  1  |  1  |  1**-- elderberry  |  1  |  1
|  1*

but a result table like this:

   Term|col  |docid| occurences
--
-- apple   |  0  |  1  |  1
-- banana  |  0  |  2  |  1
-- cherry  |  0  |  3  |  1
-- cherry  |  1  |  1  |  1
-- cherry  |  1  |  2  |  1
-- date|  0  |  2  |  2
-- elderberry  |  0  |  3  |  1
-- elderberry  |  1  |  3  |  1

Best,
mario


 2012/3/13 Mario Annau mario.an...@gmail.com:
  Hello,
 
  unfortunately I have already posted this question on
  stackoverflow
 http://stackoverflow.com/questions/9657016/get-inverted-index-from-sqlite-fts-table
 ,
  hope that this mailing list is right address.
 
  After I have implemented a full text search function in my application
  using Sqlite and FTS tables I would be interested
  in a (performant) way of retrieving the FULL inverted index (or large
 part)
  out of my FTS (sub-)table.
 
  In effect - I would need a result table including the terms, docid's and
  number of occurences.
 
  I am actually searching for some basic code/examples to read the segdir /
  segments table (where the actual index is stored ) and construct my
  desired result table (in effect - the inverted index). But any solution
  which could retrieve the full (or large part of) my inverted index using
  queries including MATCH, MATCHINFO , etc. (sorry, I'm no Sqlite export)
  would be highly appreciated!
 
  Best,
  mario
  ___
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



 --
 Best regards, Alexey Pechnikov.
 http://pechnikov.tel/
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS simple tokenizer

2012-02-28 Thread Dan Kennedy


On 02/28/2012 12:09 AM, Jos Groot Lipman wrote:

It was reported before (and not solved)
http://www.mail-archive.com/sqlite-users@sqlite.org/msg55959.html


The document sources are updated now. So the fix will appear on
the website next time it is regenerated.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS simple tokenizer

2012-02-28 Thread Matt Young

Using the _ character to separate words is an informal  language standard,
s in: method_do_this...

On Tue, Feb 28, 2012 at 12:40 AM, Dan Kennedy danielk1...@gmail.com wrote:

 On 02/28/2012 12:09 AM, Jos Groot Lipman wrote:

 It was reported before (and not solved)
 http://www.mail-archive.com/**sqlite-users@sqlite.org/**msg55959.htmlhttp://www.mail-archive.com/sqlite-users@sqlite.org/msg55959.html


 The document sources are updated now. So the fix will appear on
 the website next time it is regenerated.
 __**_
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**usershttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS simple tokenizer

2012-02-27 Thread Hamish Allan

Thanks Dan. Have just checked how to report bug, and apparently we already have 
:)

Please excuse the brevity -- sent from my phone

On 27 Feb 2012, at 07:06, Dan Kennedy danielk1...@gmail.com wrote:

 On 02/27/2012 05:59 AM, Hamish Allan wrote:
 The docs for the simple tokenizer
 (http://www.sqlite.org/fts3.html#tokenizer) say:
 
 A term is a contiguous sequence of eligible characters, where
 eligible characters are all alphanumeric characters, the _
 character, and all characters with UTF codepoints greater than or
 equal to 128.
 
 If I do:
 
 CREATE VIRTUAL TABLE test USING fts3();
 INSERT INTO test (content) VALUES ('hello_world');
 
 SELECT * FROM test WHERE content MATCH 'orld';
 SELECT * FROM test WHERE content MATCH 'world';
 
 I get no match for the first query, because it doesn't match a term,
 but I get a match for the second, whereas according to my reading of
 the docs world shouldn't be a term because the underscore character
 shouldn't be considered a term break.
 
 Can anyone please help me understand this behaviour?
 
 Documentation bug. Eligible characters are just alphanumerics and
 UTF codepoints greater than 128.
 
 Dan.
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS simple tokenizer

2012-02-27 Thread Jos Groot Lipman

It was reported before (and not solved)
http://www.mail-archive.com/sqlite-users@sqlite.org/msg55959.html 

 -Original Message-
 From: sqlite-users-boun...@sqlite.org 
 [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Hamish Allan
 Sent: maandag 27 februari 2012 11:27
 To: General Discussion of SQLite Database
 Cc: sqlite-users@sqlite.org
 Subject: Re: [sqlite] FTS simple tokenizer

 Thanks Dan. Have just checked how to report bug, and 
 apparently we already have :)

 Please excuse the brevity -- sent from my phone

 On 27 Feb 2012, at 07:06, Dan Kennedy danielk1...@gmail.com wrote:

  On 02/27/2012 05:59 AM, Hamish Allan wrote:
  The docs for the simple tokenizer
  (http://www.sqlite.org/fts3.html#tokenizer) say:

  A term is a contiguous sequence of eligible characters, where 
  eligible characters are all alphanumeric characters, the _
  character, and all characters with UTF codepoints greater than or 
  equal to 128.

  If I do:

  CREATE VIRTUAL TABLE test USING fts3(); INSERT INTO test (content) 
  VALUES ('hello_world');

  SELECT * FROM test WHERE content MATCH 'orld'; SELECT * FROM test 
  WHERE content MATCH 'world';

  I get no match for the first query, because it doesn't 
 match a term, 
  but I get a match for the second, whereas according to my 
 reading of 
  the docs world shouldn't be a term because the 
 underscore character 
  shouldn't be considered a term break.

  Can anyone please help me understand this behaviour?

  Documentation bug. Eligible characters are just 
 alphanumerics and UTF 
  codepoints greater than 128.

  Dan.
  ___
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS simple tokenizer

2012-02-26 Thread Hamish Allan

The docs for the simple tokenizer
(http://www.sqlite.org/fts3.html#tokenizer) say:

A term is a contiguous sequence of eligible characters, where
eligible characters are all alphanumeric characters, the _
character, and all characters with UTF codepoints greater than or
equal to 128.

If I do:

CREATE VIRTUAL TABLE test USING fts3();
INSERT INTO test (content) VALUES ('hello_world');

SELECT * FROM test WHERE content MATCH 'orld';
SELECT * FROM test WHERE content MATCH 'world';

I get no match for the first query, because it doesn't match a term,
but I get a match for the second, whereas according to my reading of
the docs world shouldn't be a term because the underscore character
shouldn't be considered a term break.

Can anyone please help me understand this behaviour?

Thanks,
Hamish
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS simple tokenizer

2012-02-26 Thread Dan Kennedy


On 02/27/2012 05:59 AM, Hamish Allan wrote:

The docs for the simple tokenizer
(http://www.sqlite.org/fts3.html#tokenizer) say:

A term is a contiguous sequence of eligible characters, where
eligible characters are all alphanumeric characters, the _
character, and all characters with UTF codepoints greater than or
equal to 128.

If I do:

CREATE VIRTUAL TABLE test USING fts3();
INSERT INTO test (content) VALUES ('hello_world');

SELECT * FROM test WHERE content MATCH 'orld';
SELECT * FROM test WHERE content MATCH 'world';

I get no match for the first query, because it doesn't match a term,
but I get a match for the second, whereas according to my reading of
the docs world shouldn't be a term because the underscore character
shouldn't be considered a term break.

Can anyone please help me understand this behaviour?


Documentation bug. Eligible characters are just alphanumerics and
UTF codepoints greater than 128.

Dan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS malformed MATCH expression: [( a* OR b*)]

2012-02-03 Thread Joe Mistachkin


The native SQLite code bundled with System.Data.SQLite is not compiled
with the SQLITE_ENABLE_FTS3_PARENTHESIS option, which is required for
the expression you are asking about to work properly.

Of course, it can always be recompiled if you have access to Visual C++,
by editing one of the following files and recompiling the SQLite.Interop
project:

root/SQLite.Interop/props/sqlite3.vsprops (for VC++ 2008)
root/SQLite.Interop/props/sqlite3.props (for VC++ 2010)

Where root is the root of the source tree for System.Data.SQLite.

--
Joe Mistachkin

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS malformed MATCH expression: [( a* OR b*)]

2012-02-03 Thread Stefan Rogin

Thank you,

I'll try to recompile.
I looked in the Navicat folder and it had 2 sqlite dlls(sqlite and
sqlite3), could those be of any good?
În data de 03.02.2012 16:52, Joe Mistachkin sql...@mistachkin.com a
scris:


 The native SQLite code bundled with System.Data.SQLite is not compiled
 with the SQLITE_ENABLE_FTS3_PARENTHESIS option, which is required for
 the expression you are asking about to work properly.

 Of course, it can always be recompiled if you have access to Visual C++,
 by editing one of the following files and recompiling the SQLite.Interop
 project:

root/SQLite.Interop/props/sqlite3.vsprops (for VC++ 2008)
root/SQLite.Interop/props/sqlite3.props (for VC++ 2010)

 Where root is the root of the source tree for System.Data.SQLite.

 --
 Joe Mistachkin

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS malformed MATCH expression: [( a* OR b*)]

2012-02-03 Thread Stefan Rogin

Hi,

I tried to build with all default and it gave me this:

Error 5 error LNK1181: cannot open input file
'root\SQLite.Interop\..\bin\2010\ReleaseModule\bin\System.Data.SQLite.netmodule'
root \SQLite.Interop\LINK SQLite.Interop.2010

On Fri, Feb 3, 2012 at 5:32 PM, Stefan Rogin stefan.rogin2...@gmail.comwrote:

 Thank you,

 I'll try to recompile.
 I looked in the Navicat folder and it had 2 sqlite dlls(sqlite and
 sqlite3), could those be of any good?
 În data de 03.02.2012 16:52, Joe Mistachkin sql...@mistachkin.com a
 scris:


 The native SQLite code bundled with System.Data.SQLite is not compiled
 with the SQLITE_ENABLE_FTS3_PARENTHESIS option, which is required for
 the expression you are asking about to work properly.

 Of course, it can always be recompiled if you have access to Visual C++,
 by editing one of the following files and recompiling the SQLite.Interop
 project:

root/SQLite.Interop/props/sqlite3.vsprops (for VC++ 2008)
root/SQLite.Interop/props/sqlite3.props (for VC++ 2010)

 Where root is the root of the source tree for System.Data.SQLite.

 --
 Joe Mistachkin

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




-- 

*Stefan Rogin*
Webdesigner
Tel: +40.769.622.178
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS malformed MATCH expression: [( a* OR b*)]

2012-02-03 Thread Stefan Rogin

Forget the last question.
I've managed to get passed it :), I didn't open the whole solution file,
just the  SQLite.Interop.2010 project,

I've seen that it's set on .Net 4
How can I set it to 3.5 and are there any drawbacks?

Anyway, thanks for the quick reply. +1 for the support

On Fri, Feb 3, 2012 at 9:41 PM, Stefan Rogin stefan.rogin2...@gmail.comwrote:

 Hi,

 I tried to build with all default and it gave me this:

 Error 5 error LNK1181: cannot open input file
 'root\SQLite.Interop\..\bin\2010\ReleaseModule\bin\System.Data.SQLite.netmodule'
 root \SQLite.Interop\LINK SQLite.Interop.2010

 On Fri, Feb 3, 2012 at 5:32 PM, Stefan Rogin 
 stefan.rogin2...@gmail.comwrote:

 Thank you,

 I'll try to recompile.
 I looked in the Navicat folder and it had 2 sqlite dlls(sqlite and
 sqlite3), could those be of any good?
 În data de 03.02.2012 16:52, Joe Mistachkin sql...@mistachkin.com a
 scris:


 The native SQLite code bundled with System.Data.SQLite is not compiled
 with the SQLITE_ENABLE_FTS3_PARENTHESIS option, which is required for
 the expression you are asking about to work properly.

 Of course, it can always be recompiled if you have access to Visual C++,
 by editing one of the following files and recompiling the SQLite.Interop
 project:

root/SQLite.Interop/props/sqlite3.vsprops (for VC++ 2008)
root/SQLite.Interop/props/sqlite3.props (for VC++ 2010)

 Where root is the root of the source tree for System.Data.SQLite.

 --
 Joe Mistachkin

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




 --

 *Stefan Rogin*
 Webdesigner
 Tel: +40.769.622.178




-- 

*Stefan Rogin*
Webdesigner
Tel: +40.769.622.178
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS malformed MATCH expression: [( a* OR b*)]

2012-02-03 Thread Joe Mistachkin


Stefan Rogin wrote:

 I've seen that it's set on .Net 4


The VS 2010 project uses .NET 4, the VS 2008 project uses .NET 3.5.


 How can I set it to 3.5 and are there any drawbacks?


Depends on what environment(s) you are going to use your application in.
There are trade-offs; however, that is why we provide support for both.

--
Joe Mistachkin

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS malformed MATCH expression: [( a* OR b*)]

2012-02-02 Thread Stefan Rogin

Hi,

I've encountered some issues when using *FTS4 MATCH* on a database, like
the one mentioned in the subject (*malformed MATCH expression: [( a* OR
b*)]*).
From what I discovered it is triggered by using a quote at the beginning of
a parenthesis, if I write *(b* OR a*)* it works just fine.
What I found interesting is that if I use Navicat with the SQLite database,
the query above works just fine.
Also there is support for  NOT  opposed to  -  supported
by System.Data.SQLite.

*this is the syntax I used to get the data:*

DataTable dt = new DataTable();

using (SQLiteCommand cmd = mySQLiteConn.CreateCommand())
{
cmd.CommandType = CommandType.Text;
cmd.CommandText = sql;
SQLiteDataAdapter sqlDa = new SQLiteDataAdapter(cmd);
sqlDa.Fill(dt);
}

*exception detail:*

System.Data.SQLite.SQLiteException was unhandled
  Message=SQLite error
malformed MATCH expression: [(a* OR b*)]
  Source=System.Data.SQLite
  ErrorCode=-2147467259
  StackTrace:
   at System.Data.SQLite.SQLite3.Reset(SQLiteStatement stmt)
   at System.Data.SQLite.SQLite3.Step(SQLiteStatement stmt)
   at System.Data.SQLite.SQLiteDataReader.NextResult()
   at System.Data.SQLite.SQLiteDataReader..ctor(SQLiteCommand cmd,
CommandBehavior behave)
   at System.Data.SQLite.SQLiteCommand.ExecuteReader(CommandBehavior
behavior)
   at
System.Data.SQLite.SQLiteCommand.ExecuteDbDataReader(CommandBehavior
behavior)
   at
System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior
behavior)
   at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset,
DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String
srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataTable[] dataTables,
Int32 startRecord, Int32 maxRecords, IDbCommand command, CommandBehavior
behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataTable dataTable)
   at SQLite_Take2.Form1.ftsSearch() in D:\c#\SQLite Take2\Main.cs:line
607
   at SQLite_Take2.Form1.cmdSearch_Click(Object sender, EventArgs e) in
D:\c#\SQLite Take2\Main.cs:line 340
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message m, MouseButtons
button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message m)
   at System.Windows.Forms.ButtonBase.WndProc(Message m)
   at System.Windows.Forms.Button.WndProc(Message m)
   at
System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message
m)
   at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd,
Int32 msg, IntPtr wparam, IntPtr lparam)
   at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG
msg)
   at
System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32
dwComponentID, Int32 reason, Int32 pvLoopData)
   at
System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32
reason, ApplicationContext context)
   at
System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason,
ApplicationContext context)
   at System.Windows.Forms.Application.Run(Form mainForm)
   at SQLite_Take2.Program.Main() in D:\c#\SQLite Take2\Program.cs:line
18
   at System.AppDomain._nExecuteAssembly(Assembly assembly, String[]
args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence
assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext
executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
  InnerException:


*I use:*
Windows 7 , 64bit
Visual Studio 2010
SQLite Server version : 3.7.7.1
Connection string: Data Source=.\temp.db;UTF8Encoding=True;Version=3
System.Data.SQLite.dll version : 1.0.74.0
-- 

*Stefan Rogin*
Webdesigner
Tel: +40.769.622.178
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] [FTS] Executing Sql statements inside a custom tokenizer

2012-01-03 Thread Abhinav Upadhyay

Hi,

I would like to build up a table of all the unique words occurring in
my corpus (for spelling suggestion feature). Presently I am using the
Porter stemming tokenizer and I would not like to stop using the
stemmer at any cost. Although if I was not using the Porter stemmer
then I could easily obtain the list of unique words in the corpus
using the FTS4Aux module. But using the stemmer means that all the
words are stored in the index in their stem form which is not
desirable for building a dictionary of proper English words.

One solution is to use a custom tokenizer. I was thinking of using the
default Porter tokenizer supplied with Sqlite and adding some bits of
code to store the token in a separate table before stemming it down.
But I am not sure if it is ok to access or modify the database using
Sql statements inside a tokenizer. Now that I think of it, the
tokenizer code is also executed when an SQL query is performed against
the FTS table (when performing search), at which time I don't want my
dictionary building code to execute. So perhaps this is not a good
idea.

What other options do I have ?

Thanks
Abhinav
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] [FTS] Executing Sql statements inside a custom tokenizer

2012-01-03 Thread Petite Abeille


On Jan 3, 2012, at 8:30 PM, Abhinav Upadhyay wrote:

 What other options do I have ?

Two FTS tables? One with the Porter stemmer, for search, one without, to build 
the auxiliary tables? 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] [FTS] Executing Sql statements inside a custom tokenizer

2012-01-03 Thread Abhinav Upadhyay

 Two FTS tables? One with the Porter stemmer, for search, one without, to 
 build the auxiliary tables?

Yeah, that is the last option, if nothing else works. For a small set
of documents the extra processing time might be ok but for a larger
set of documents building the FTS tables twice might be a bit taxing.

I think I will first try a custom tokenizer. So is it ok to execute
SQL statements from within the tokenizer ? It would have been great if
there was some way to determine whether the tokenizer code is being
executed for indexing the documents or for searching the index.

Thanks
Abhinav
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Fts */or. Inconsistencies

2011-12-12 Thread Ephraim Stevens

SQLite Gurus,

In SQLIte FTS3/4, does the '*' (wildcard expansion character) discriminate
between alphanumeric characters vs non-alpha numeric characters when
matching? I have two test cases below which causes me to believe that it
does. Also, the OR operator appears to fail when matching against a
combination of alphanumerics and non-alphanumerics.

I'm using a custom tokenizer in each scenario (yes it works and the proof
is enclosed). In the first dataset, the data was tokenized such that any
alphanumeric character qualifies as part of a token.

In the second dataset, the data was tokenized such that anything other than
a semicolon qualifies as part of a token.\

The issues I'm raising is that 1) the '*' expansion doesn't seem to work
for a alphanumeric/non-alphanumeric token matches nor does the OR operator.
I haven't figured out what I'm missing here. I've been banging my head all
morning against tihs. I appreciate any help.

I've listed a working and non working example below since a comparison
should clarify best:

[THIS WORKS FINE]

MY DATASET:

STATE|NAMES

---

maryland|fred,louis,jenny

virginia|ruth,greg,denise

maine|richard,norman,willis

TOKENIZER CRITERIA USED: = any word characters (\w+)

(comment: all these names should be split into regular tokens and they were
as shown below)

FULL TEXT INDEX:

sqlite select * from ft_terms;

term col documents occurrences

-- -- -- ---

fred * 1 1

fred 1 1 1

jenny * 1 1

jenny 1 1 1

louis * 1 1

louis 1 1 1

maryland * 1 1

maryland 0 1 1

ruth * 1 1

ruth 1 1 1

virginia * 1 1

virginia 0 1 1

QUERY:

select * from word where word match 'mary* jen*'

RETURNS:

maryland|fred,louis,jenny

QUERY:

select * from word where word match 'mary* OR v*'

RETURNS:

maryland|fred,louis,jenny

virginia|ruth,greg,denise

All the above behaves as expected. Now lets introduce some non-alphanumerics

--

[THIS DOESN'T WORK]

DATASET:

ROWSET|PAIR

--

1 A=15;B=16;C=38

2 D=15;E=25;F=16

TOKENIZER CRITERIA USED: = any character that is NOT a semicolon ([^;]+)

(comment: all these PAIR values should be tokenized by semicolon and they
were as shown below)

FULL TEXT INDEX:

term col documents occurrences

-- -- -- ---

1 * 1 1

1 0 1 1

2 * 1 1

2 0 1 1

A=15 * 1 1

A=15 1 1 1

B=16 * 1 1

B=16 1 1 1

C=38 * 1 1

C=38 1 1 1

D=15 * 1 1

D=15 1 1 1

E=25 * 1 1

E=25 1 1 1

F=16 * 1 1

F=16 1 1 1

QUERY1:

select * from NUMMY where NUMMY MATCH 'A=* OR D=*'

RETURNS:

(nothing)

comment: Should have returned ROWSET 1 and 2 (refer to above dataset)

QUERY2:

select * from NUMMY where NUMMY MATCH 'A* C*'

RETURNS:

(nothing)

comment: Should have returned ROWSET 1 (refer to above dataset)

--HOWEVER SPECIFYING THE FULL TOKEN WORKS--

QUERY3:

select * from NUMMY where NUMMY MATCH 'A=15'

RETURNS:

1|A=15;B=16;C=38

QUERY4:

select * from NUMMY where NUMMY MATCH 'A=15'

RETURNS:

1|A=15;B=16;C=38

QUERY5:

select * from NUMMY where NUMMY MATCH 'E=25'

RETURNS:

2|D=15;E=25;F=16

--THIS SUCCESS IS SHORTLIVED--

QUERY6:

select * from NUMMY where NUMMY MATCH 'E=25 OR B=16'

RETURNS:

(nothing)

This query should have returned rowset 1 and two as this was an OR query
and both creiteria are met.

In summary, the wildcard expansion as well as the OR operator seems not to
work in the second example. There only main difference between the two data
sets in that the first set is composed of alpha characters only and the
second is a combination of alphanumeric and non-aplhanumerica characters.
The (*) expansion character is not matching these. A match only occurs when
you specify the full token.

Thanks for your time in looking at this issue.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Fts */or. Inconsistencies

2011-12-12 Thread Dan Kennedy


On 12/13/2011 02:29 AM, Ephraim Stevens wrote:

I'm using a custom tokenizer in each scenario (yes it works and the proof
is enclosed). In the first dataset, the data was tokenized such that any
alphanumeric character qualifies as part of a token.

In the second dataset, the data was tokenized such that anything other than
a semicolon qualifies as part of a token.\

The issues I'm raising is that 1) the '*' expansion doesn't seem to work
for a alphanumeric/non-alphanumeric token matches nor does the OR operator.
I haven't figured out what I'm missing here. I've been banging my head all
morning against tihs. I appreciate any help.


Do your custom tokenizers allow whitespace or * characters
to be part of tokens? If so, try changing them so that they
do not.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS Tokenizer (separator)

2011-12-09 Thread Ephraim Stevens

Greetings All,

From section seven of the FTS3/FTS4 documentation:

A term is a contiguous sequence of eligible characters, where eligible
characters are all alphanumeric characters, the _ character, and all
characters with UTF codepoints greater than or equal to 128. All other
characters are discarded when splitting a document into terms. Their only
contribution is to separate adjacent terms.


Is there a way to modify/control this behavior?  I would like the equal
sign ('=') to be treated with the same designation as an alpha numeric
character. Currently, the equal sign acts as a separator.

Thanks in advance for any suggestions/help you provide.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS Tokenizer (separator)

2011-12-09 Thread Richard Hipp

On Fri, Dec 9, 2011 at 6:48 AM, Ephraim Stevens
ephraim.stev...@gmail.comwrote:

 Greetings All,

 From section seven of the FTS3/FTS4 documentation:

 A term is a contiguous sequence of eligible characters, where eligible
 characters are all alphanumeric characters, the _ character, and all
 characters with UTF codepoints greater than or equal to 128. All other
 characters are discarded when splitting a document into terms. Their only
 contribution is to separate adjacent terms.


 Is there a way to modify/control this behavior?  I would like the equal
 sign ('=') to be treated with the same designation as an alpha numeric
 character. Currently, the equal sign acts as a separator.

 Thanks in advance for any suggestions/help you provide.


Create your own tokenizer.  http://www.sqlite.org/fts3.html#section_7_1


 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS: Phrase queries

2011-11-27 Thread Fabian

2011/11/14 nobre rafael.ro...@novaprolink.com.br


 Comment from the source:

  ** TODO: Strangely, it is not possible to associate a column specifier
  ** with a quoted phrase, only with a single token. Not sure if this was
  ** an implementation artifact or an intentional decision when fts3 was
  ** first implemented. Whichever it was, this module duplicates the
  ** limitation.

 So, seems its really not possible
 Rafael


Thanks, I hope this limitation will be lifted someday. Fixing it will not
break any existing queries and is fully backwards compatible, so I dont
understand why FTS4 duplicates that (faulty) behaviour.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS: Phrase queries

2011-11-14 Thread nobre


Comment from the source:

  ** TODO: Strangely, it is not possible to associate a column specifier
  ** with a quoted phrase, only with a single token. Not sure if this was
  ** an implementation artifact or an intentional decision when fts3 was
  ** first implemented. Whichever it was, this module duplicates the 
  ** limitation.

So, seems its really not possible
Rafael


Fabian-40 wrote:
 
 When I have a basic FTS query that needs to be restricted to a column, I
 can write it in two ways:
 
 1.) WHERE column MATCH 'apple'
 2.) WHERE table MATCH 'column:apple'
 
 But when I have a phrase query, I can only write it in one way:
 
 1.) WHERE column MATCH 'apple juice'
 
 The problem is that when I want to combine the queries (search for 'apple'
 in column1 and for apple juice in column2) i cannot write the query
 like:
 
 WHERE column1 MATCH 'apple' AND column2 MATCH 'apple juice'
 
 Nor can I write it like:
 
 WHERE table MATCH 'column1:apple column2:apple juice'
 
 So this fairly simple query, seems impossible in FTS? Or does anyone know
 how to workaround this (without doing two seperate queries)?
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 
 

-- 
View this message in context: 
http://old.nabble.com/FTS%3A-Phrase-queries-tp32834649p32839669.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS: Phrase queries

2011-11-13 Thread Fabian

When I have a basic FTS query that needs to be restricted to a column, I
can write it in two ways:

1.) WHERE column MATCH 'apple'
2.) WHERE table MATCH 'column:apple'

But when I have a phrase query, I can only write it in one way:

1.) WHERE column MATCH 'apple juice'

The problem is that when I want to combine the queries (search for 'apple'
in column1 and for apple juice in column2) i cannot write the query like:

WHERE column1 MATCH 'apple' AND column2 MATCH 'apple juice'

Nor can I write it like:

WHERE table MATCH 'column1:apple column2:apple juice'

So this fairly simple query, seems impossible in FTS? Or does anyone know
how to workaround this (without doing two seperate queries)?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS vs INDEX

2011-10-19 Thread Fabian

Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT
INDEX column? I don't need many of the extra features of FTS, because I
always need to look up rows by prefix or exact match, and both can be
implemented efficiently via TEXT INDEX too. But if the overhead is
comparable, I'd rather use FTS.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Black, Michael (IS)

I recently benchmarked this...FTS4 has a prefix option that can make it 
slightly faster than TEXT.  Other than that it's about the same speed.

http://www.mail-archive.com/sqlite-users@sqlite.org/msg64591.html



The older part of the thread has the benchmark data



Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Fabian [fabianpi...@gmail.com]
Sent: Wednesday, October 19, 2011 9:20 AM
To: General Discussion of SQLite Database
Subject: EXT :[sqlite] FTS vs INDEX

Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT
INDEX column? I don't need many of the extra features of FTS, because I
always need to look up rows by prefix or exact match, and both can be
implemented efficiently via TEXT INDEX too. But if the overhead is
comparable, I'd rather use FTS.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Fabian

Very interesting benchmarks! However it seems to focus mainly on the speed
of SELECT queries, and the total size of the resulting database on disk. But
my main concern is about the speed of INSERT queries vs normal tables. Any
chance you compared that too?

2011/10/19 Black, Michael (IS) michael.bla...@ngc.com

 I recently benchmarked this...FTS4 has a prefix option that can make it
 slightly faster than TEXT.  Other than that it's about the same speed.

 http://www.mail-archive.com/sqlite-users@sqlite.org/msg64591.html



 The older part of the thread has the benchmark data



 Michael D. Black

 Senior Scientist

 Advanced Analytics Directorate

 Advanced GEOINT Solutions Operating Unit

 Northrop Grumman Information Systems

 
 From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on
 behalf of Fabian [fabianpi...@gmail.com]
 Sent: Wednesday, October 19, 2011 9:20 AM
 To: General Discussion of SQLite Database
 Subject: EXT :[sqlite] FTS vs INDEX

 Did anyone do some benchmarks how the insert-speed of FTS compares to a
 TEXT
 INDEX column? I don't need many of the extra features of FTS, because I
 always need to look up rows by prefix or exact match, and both can be
 implemented efficiently via TEXT INDEX too. But if the overhead is
 comparable, I'd rather use FTS.
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Alexey Pechnikov

FTS use index multi-tree and de-facto has _no_ insert speed degradation.
I did do test for 400+ millions of records.
With b-tree index there is insert speed degradation:
http://geomapx.blogspot.com/2010/04/sqlite-index-degradation-tests.html
http://geomapx.blogspot.com/search?q=index+speed

So FTS as hash-index is nice.

2011/10/19 Fabian fabianpi...@gmail.com:
 Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT
 INDEX column? I don't need many of the extra features of FTS, because I
 always need to look up rows by prefix or exact match, and both can be
 implemented efficiently via TEXT INDEX too. But if the overhead is
 comparable, I'd rather use FTS.
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Fabian

2011/10/19 Alexey Pechnikov pechni...@mobigroup.ru

 FTS use index multi-tree and de-facto has _no_ insert speed degradation.


Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that
same multi-tree mechanism for regular indexes, but that's a whole different
question.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Alexey Pechnikov

2011/10/19 Fabian fabianpi...@gmail.com:
 Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that
 same multi-tree mechanism for regular indexes, but that's a whole different
 question.

It's impossible with SQLite3 database format. May be SQLite4 will be
support it :)

-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Black, Michael (IS)

Nope -- didn't note the insert speed on that test.



Why don't you take my benchmark data and test it yourself?  Then post the 
results.



The saying your mileage may vary comes to mind...





Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Fabian [fabianpi...@gmail.com]
Sent: Wednesday, October 19, 2011 9:44 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] FTS vs INDEX

Very interesting benchmarks! However it seems to focus mainly on the speed
of SELECT queries, and the total size of the resulting database on disk. But
my main concern is about the speed of INSERT queries vs normal tables. Any
chance you compared that too?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Scott Hess

On Wed, Oct 19, 2011 at 7:56 AM, Fabian fabianpi...@gmail.com wrote:
 2011/10/19 Alexey Pechnikov pechni...@mobigroup.ru
 FTS use index multi-tree and de-facto has _no_ insert speed degradation.

 Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that
 same multi-tree mechanism for regular indexes, but that's a whole different
 question.

To be clear, how it works is that new insertions are batched into a
new index tree, with index trees periodically aggregated to keep
selection efficient and to keep the size contained.  So while the
speed per insert should remain pretty stable constant, periodically an
insert will require index maintenance, so that insert will be slower.
If you have a lot of documents (or a small page cache) these
maintenance events can get pretty expensive relative to the cost of a
non-maintenance insert.  So it's not a clear-cut win, but it probably
would be interesting as an alternative sort of index for some tables.

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Fabian

2011/10/19 Scott Hess sh...@google.com


 To be clear, how it works is that new insertions are batched into a
 new index tree, with index trees periodically aggregated to keep
 selection efficient and to keep the size contained.  So while the
 speed per insert should remain pretty stable constant, periodically an
 insert will require index maintenance, so that insert will be slower.
 If you have a lot of documents (or a small page cache) these
 maintenance events can get pretty expensive relative to the cost of a
 non-maintenance insert.  So it's not a clear-cut win, but it probably
 would be interesting as an alternative sort of index for some tables.


I always do inserts in batches of 100.000 rows, and after each batch I
manually merge the b-trees using:

 INSERT INTO table(table) VALUES('optimize');

Is there a possibility that it will do automatic maintenance half-way during
a batch? Or will it always wait untill the transaction is finished?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS: Reduce tokens

2011-10-19 Thread Fabian

Using the default tokenizer, everything that is not an alphanumeric
character or an underscore, will generate a new token.

I have a lot of columns that contains e-mail addresses or URL's, and most of
them have characters like '.', '@' and '/'. Is there a simple way to make
FTS see them as one single token, instead of splitting those strings into
many small ones? I know it's possible to develop a custom tokenizer, but
that's way over my head I'm afraid :)
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Scott Hess

On Wed, Oct 19, 2011 at 12:50 PM, Fabian fabianpi...@gmail.com wrote:
 2011/10/19 Scott Hess sh...@google.com
 To be clear, how it works is that new insertions are batched into a
 new index tree, with index trees periodically aggregated to keep
 selection efficient and to keep the size contained.  So while the
 speed per insert should remain pretty stable constant, periodically an
 insert will require index maintenance, so that insert will be slower.
 If you have a lot of documents (or a small page cache) these
 maintenance events can get pretty expensive relative to the cost of a
 non-maintenance insert.  So it's not a clear-cut win, but it probably
 would be interesting as an alternative sort of index for some tables.

 I always do inserts in batches of 100.000 rows, and after each batch I
 manually merge the b-trees using:

  INSERT INTO table(table) VALUES('optimize');

 Is there a possibility that it will do automatic maintenance half-way during
 a batch? Or will it always wait untill the transaction is finished?

It does it when it does it, in fact you're probably getting some small
merges during this process already.  If you're doing your batch
inserts within a surrounding transaction, and are inserting documents
by ascending docid (or letting the system choose docid), it can buffer
up many updates in memory before flushing them to disk, which is
pretty efficient.  Inserting 100,000 documents this way will probably
not hit any very large merges, unless your documents tend to have a
very large number of unique terms.

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Alexey Pechnikov

2011/10/19 Fabian fabianpi...@gmail.com:
 I always do inserts in batches of 100.000 rows, and after each batch I
 manually merge the b-trees using:

  INSERT INTO table(table) VALUES('optimize');

 Is there a possibility that it will do automatic maintenance half-way during
 a batch? Or will it always wait untill the transaction is finished?

I think you are victim of the premature optimization :)

See documentation:
several different b-trees that are incrementally merged as rows are inserted,
updated and deleted. This technique improves performance when writing to an
FTS table, but causes some overhead for full-text queries that use the index.

So you can work with a big FTS tables without using the optimize method.
I use some FTS tables with tens of millions records and effect of the
optimize
isn't measurable.

-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS management

2011-09-30 Thread Gabriele Favrin


Hello.
I've always got great help in this list so I thank in advance who posts 
here and who will answer to my question.


I've started to use FTS in a web site for search thru a table of sites.
Given a main table containing, among others, fields id, url, nome 
(title) and descrizione (description), I keep the fts table updated in 
this way:


CREATE TRIGGER upd_sito
AFTER UPDATE
ON siti
BEGIN
UPDATE fts3_siti
SET nome=new.nome, url=new.url, descrizione=new.descrizione
WHERE docid=new.id;
END;

CREATE TRIGGER del_sito
AFTER DELETE
ON siti
BEGIN
DELETE FROM fts3_siti WHERE docid=old.id;
END;

CREATE TRIGGER ins_sito
AFTER INSERT
ON siti
BEGIN
INSERT INTO fts3_siti(docid,nome,url,descrizione)
VALUES(new.id,new.nome,new.url,new.descrizione);
END;

The FTS virtual table is defined as:

CREATE VIRTUAL TABLE fts3_siti USING fts3 (
nome TEXT,
url TEXT,
descrizione TEXT);

I use fts3 since my host doesn't yet offer PHP 5.3.8 which contains a 
SQLite version that includes FTS4.


Before extending this feature to other sections of the site I need to 
know if this is an efficient way to keep the FTS table updated.


Also, is there any pragma or so to tell FTS to automatically search for 
parts of words, instead of requiring users to add *?


As said, thanks for any help and sorry for my bad english.



--
Saluti da Gabriele Favrin
http://www.favrin.net
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS: prefix wildcards?

2011-07-21 Thread Sebastian Vogelsang

Hey guys,

as far as I understand the documentation FTS3/4 does not support prefix 
wildcards when searching (e.g. *board = skateboard, longboard, snowboard). 
Is there any way to get this working by now? 
I read that the right tokenizer may help. Are there any open source ones out 
there?

Regards,

Sebastian
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS sqlite3_last_insert_rowid

2011-05-14 Thread Steven Parkes

 Please try the latest code checkin (
 http://www.sqlite.org/src/info/e569f18b98) and let me know if it works any
 better for you.

Thanks. I've already adjusted the code to manually assign keys, but I'll try to 
get back to checking it.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS sqlite3_last_insert_rowid

2011-05-13 Thread Steven Parkes

I gather sqlite3_last_insert_rowid doesn't play well with FTS? I don't see an 
exception noted in the docs but neither are there non-manually managed examples.

I'd prefer not to manually mange them but ...
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS sqlite3_last_insert_rowid

2011-05-13 Thread skywind mailing lists

Hi,

FTS and sqlite3_last_insert_rowid do not work together. This is a known 
shortcoming. Basically this also means that you can't use any triggers 
involving FTS.

Regards,

Hartwig

Am 13.05.2011 um 17:38 schrieb Steven Parkes:

 I gather sqlite3_last_insert_rowid doesn't play well with FTS? I don't see an 
 exception noted in the docs but neither are there non-manually managed 
 examples.
 
 I'd prefer not to manually mange them but ...
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS sqlite3_last_insert_rowid

2011-05-13 Thread Richard Hipp

On Fri, May 13, 2011 at 11:38 AM, Steven Parkes smpar...@smparkes.netwrote:

 I gather sqlite3_last_insert_rowid doesn't play well with FTS? I don't see
 an exception noted in the docs but neither are there non-manually managed
 examples.


Please try the latest code checkin (
http://www.sqlite.org/src/info/e569f18b98) and let me know if it works any
better for you.



 I'd prefer not to manually mange them but ...
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] fts virtual table questions

2011-05-12 Thread Paul Shaffer

I didn't get an answer to my earlier question on fts. I guess it's
difficult area.

Should an fts virtual table always be re-created from scratch when the
database is opened by the application before you start using fts commands?
I see that the fts tables are not deleted when the database is closed. I
have not found enough info on sqlite virtual tables generally.

Do you find it is possible to incrementally add and delete rows in the fts
virtual table (insert rows) as the application runs? As new data is added I
would like to update the fts.

If I delete rows in a table that was originally the source of data for fts
virtual table creation, the fts virtual table retains all the expired data.
In this case do you drop the fts table and recreate it, or try to delete
rows in the fts table?

I would gladly get all this info from docs and not bother you, but the docs
on sqlite fts don't have much practical everyday usage information.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] fts virtual table questions

2011-05-12 Thread Richard Hipp

On Thu, May 12, 2011 at 4:07 PM, Paul Shaffer sqli...@cyberplasm.comwrote:

 I didn't get an answer to my earlier question on fts. I guess it's
 difficult area.

 Should an fts virtual table always be re-created from scratch when the
 database is opened by the application before you start using fts commands?


No. FTS tables persist just like any other table.



 I see that the fts tables are not deleted when the database is closed. I
 have not found enough info on sqlite virtual tables generally.

 Do you find it is possible to incrementally add and delete rows in the fts
 virtual table (insert rows) as the application runs?


Yes.  That works fine for most users.



 As new data is added I
 would like to update the fts.

 If I delete rows in a table that was originally the source of data for fts
 virtual table creation, the fts virtual table retains all the expired data.
 In this case do you drop the fts table and recreate it, or try to delete
 rows in the fts table?


Just delete the rows in the FTS table.



 I would gladly get all this info from docs and not bother you, but the docs
 on sqlite fts don't have much practical everyday usage information.


 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS snippet()

2011-04-14 Thread Gert Van Assche

Drake,

if I do this, I get: SQL logic error or missing database.

Thanks

Gert
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] FTS snippet()

2011-04-13 Thread Gert Van Assche

Hi all,

I'm sure I'm doing something stupid here...

  CREATE VIRTUAL TABLE example USING fts4(TOKEN, CONTEXT);

  INSERT INTO example(TOKEN, CONTEXT) VALUES('one', 'This is just one
sentence.');
  INSERT INTO example(TOKEN, CONTEXT) VALUES('two', 'This is just one
sentence. Sorry, it are two sentences.');
  INSERT INTO example(TOKEN, CONTEXT) VALUES('three', 'More then three
words in one sentence.');

  SELECT snippet(example, '[', ']') FROM example WHERE CONTEXT MATCH
(SELECT TOKEN FROM example);


this returns

  This is just [one] sentence.
  This is just [one] sentence. Sorry, it are two sentences.
  More then three words in [one] sentence.


while I was hoping for

  This is just [one] sentence.
  This is just one sentence. Sorry, it are [two] sentences.
  More then [three] words in one sentence.


Can anyone tell me what I'm doing wrong?

thanks

gert
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] FTS snippet()

2011-04-13 Thread Drake Wilson

Quoth Gert Van Assche ger...@gmail.com, on 2011-04-13 22:35:49 +0200:
   SELECT snippet(example, '[', ']') FROM example WHERE CONTEXT MATCH
 (SELECT TOKEN FROM example);

You're asking to match a single independently arbitrarily chosen token
from anywhere in the table (which is not even the same as matching at
least one token from the table), not whether it matches the one from
the same row.

Can you do WHERE CONTEXT MATCH TOKEN instead?  I think you still need
a full table scan for that, but it should return the right results
unless FTS4 has some relevant restriction on the RHS of a MATCH.

   --- Drake Wilson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

1 2 3 >

1 - 100 of 213 matches

Mail list logo