Re: [sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table

2019-10-20 Thread Ben Asher
Interesting! I appreciate the detailed response. I don't think the shadow table 
digging fits our risk profile exactly :), but it's interesting to know where to 
look if we want to check ourselves. I realized after rereading all of this that 
ultimately we want to keep track of the max rowid inserted into the index, so I 
think we'll just track that separately on our own.

Thanks again!

Ben

> On Oct 20, 2019, at 1:45 AM, Dan Kennedy  wrote:
> 
> 
>> On 19/10/62 06:31, Ben Asher wrote:
>> Hello! I'm trying to write some code to keep an external content table in
>> sync with the index. To do this, I need to be able to get some state about
>> the index: either how many rows have been inserted so far or the max rowid
>> that has been inserted into the index. However, if I try to run queries
>> (where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*)
>> FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being
>> an answer as if I had run those queries on the external content table. Is
>> there some other way I can query the state in this case?
> 
> You can, but it's not quite as easy as it could be. Assuming you're not using 
> the "columnsize=0" option, the xRowCount() API, which is only available from 
> within an auxiliary function returns the value you want:
> 
> https://sqlite.org/fts5.html#xRowCount
> 
> The xRowCount() will just fall back to the "SELECT count(*)..." method, which 
> will report the number of rows in the external content table, not the number 
> of rows that have been added to the index.
> 
> So if you're prepared to write an auxiliary fts5 function in C that invokes 
> the xRowCount() API and returns its value, you could do something like:
> 
>   SELECT row_count(text) FROM text LIMIT 1;
> 
> Assuming your fts5 table is named "text". Or, if you include the "matchinfo" 
> demo code in your app from here:
> 
> https://sqlite.org/src/artifact/08c11ec968148d4c
> 
> in your build:
> 
>   SELECT matchinfo(text, 'n') FROM text LIMIT 1;
> 
> Both of these will only work if there is at least one row in the external 
> content table (as otherwise the query will return zero rows). If your 
> external content table is sometimes empty, you'll have to figure out a 
> full-text query that always returns at least one row to use.
> 
> Another way to go, if you're a risk-taking sort, is to read the same value 
> directly from the fts5 shadow tables. The query:
> 
>   SELECT block FROM test_data WHERE id=1;
> 
> returns a blob that consists of a series of SQLite varints. The first of 
> those varints is the number of rows in the index. Again, assuming your fts5 
> table is named "test".
> 
> This isn't actually a public interface, so you might get burned if it changes 
> at some point. I think that's pretty unlikely, but no promises!
> 
> On consideration, I'd probably go for the direct query on the test_data 
> table. Not least because if you use the xRowCount() API from a non-full-text 
> query you will need this fix, which won't be released until 3.31.0:
> 
>   https://sqlite.org/src/info/b528bdcd45db1b78
> 
> But have a test in place to ensure it doesn't break when you upgrade SQLite. 
> And if you can, build SQLite directly into the app (almost always a good 
> idea), don't use the system version.
> 
> Good luck,
> 
> Dan.
> 
> 
> 
> 
> 
>> 
>> Thanks!
>> 
>> Ben
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table

2019-10-20 Thread Dan Kennedy


On 19/10/62 06:31, Ben Asher wrote:

Hello! I'm trying to write some code to keep an external content table in
sync with the index. To do this, I need to be able to get some state about
the index: either how many rows have been inserted so far or the max rowid
that has been inserted into the index. However, if I try to run queries
(where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*)
FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being
an answer as if I had run those queries on the external content table. Is
there some other way I can query the state in this case?


You can, but it's not quite as easy as it could be. Assuming you're not 
using the "columnsize=0" option, the xRowCount() API, which is only 
available from within an auxiliary function returns the value you want:


https://sqlite.org/fts5.html#xRowCount

The xRowCount() will just fall back to the "SELECT count(*)..." method, 
which will report the number of rows in the external content table, not 
the number of rows that have been added to the index.


So if you're prepared to write an auxiliary fts5 function in C that 
invokes the xRowCount() API and returns its value, you could do 
something like:


  SELECT row_count(text) FROM text LIMIT 1;

Assuming your fts5 table is named "text". Or, if you include the 
"matchinfo" demo code in your app from here:


https://sqlite.org/src/artifact/08c11ec968148d4c

in your build:

  SELECT matchinfo(text, 'n') FROM text LIMIT 1;

Both of these will only work if there is at least one row in the 
external content table (as otherwise the query will return zero rows). 
If your external content table is sometimes empty, you'll have to figure 
out a full-text query that always returns at least one row to use.


Another way to go, if you're a risk-taking sort, is to read the same 
value directly from the fts5 shadow tables. The query:


  SELECT block FROM test_data WHERE id=1;

returns a blob that consists of a series of SQLite varints. The first of 
those varints is the number of rows in the index. Again, assuming your 
fts5 table is named "test".


This isn't actually a public interface, so you might get burned if it 
changes at some point. I think that's pretty unlikely, but no promises!


On consideration, I'd probably go for the direct query on the test_data 
table. Not least because if you use the xRowCount() API from a 
non-full-text query you will need this fix, which won't be released 
until 3.31.0:


  https://sqlite.org/src/info/b528bdcd45db1b78

But have a test in place to ensure it doesn't break when you upgrade 
SQLite. And if you can, build SQLite directly into the app (almost 
always a good idea), don't use the system version.


Good luck,

Dan.







Thanks!

Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table

2019-10-18 Thread Ben Asher
I should have included this in the first place. Here are the queries used
to setup this table and an example:

CREATE TABLE text (text TEXT NONNULL);
CREATE VIRTUAL TABLE text_fts_index USING fts5(text, content=text,
content_rowid=rowid, prefix='2 3 4 5 6 7 8 9 10', tokenize='unicode61');
INSERT INTO text VALUES ('hello');

-- I was hoping the following would return 0, but it returns 1, presumably
because it's getting the answer from the external content table
SELECT COUNT(content_rowid) FROM text_fts_index;

Thanks again!

Ben

On Fri, Oct 18, 2019 at 4:31 PM Ben Asher  wrote:

> Hello! I'm trying to write some code to keep an external content table in
> sync with the index. To do this, I need to be able to get some state about
> the index: either how many rows have been inserted so far or the max rowid
> that has been inserted into the index. However, if I try to run queries
> (where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*)
> FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being
> an answer as if I had run those queries on the external content table. Is
> there some other way I can query the state in this case?
>
> Thanks!
>
> Ben
>


-- 
Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table

2019-10-18 Thread Ben Asher
Hello! I'm trying to write some code to keep an external content table in
sync with the index. To do this, I need to be able to get some state about
the index: either how many rows have been inserted so far or the max rowid
that has been inserted into the index. However, if I try to run queries
(where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*)
FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being
an answer as if I had run those queries on the external content table. Is
there some other way I can query the state in this case?

Thanks!

Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users