Re: [sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table
Interesting! I appreciate the detailed response. I don't think the shadow table digging fits our risk profile exactly :), but it's interesting to know where to look if we want to check ourselves. I realized after rereading all of this that ultimately we want to keep track of the max rowid inserted into the index, so I think we'll just track that separately on our own. Thanks again! Ben > On Oct 20, 2019, at 1:45 AM, Dan Kennedy wrote: > > >> On 19/10/62 06:31, Ben Asher wrote: >> Hello! I'm trying to write some code to keep an external content table in >> sync with the index. To do this, I need to be able to get some state about >> the index: either how many rows have been inserted so far or the max rowid >> that has been inserted into the index. However, if I try to run queries >> (where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*) >> FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being >> an answer as if I had run those queries on the external content table. Is >> there some other way I can query the state in this case? > > You can, but it's not quite as easy as it could be. Assuming you're not using > the "columnsize=0" option, the xRowCount() API, which is only available from > within an auxiliary function returns the value you want: > > https://sqlite.org/fts5.html#xRowCount > > The xRowCount() will just fall back to the "SELECT count(*)..." method, which > will report the number of rows in the external content table, not the number > of rows that have been added to the index. > > So if you're prepared to write an auxiliary fts5 function in C that invokes > the xRowCount() API and returns its value, you could do something like: > > SELECT row_count(text) FROM text LIMIT 1; > > Assuming your fts5 table is named "text". Or, if you include the "matchinfo" > demo code in your app from here: > > https://sqlite.org/src/artifact/08c11ec968148d4c > > in your build: > > SELECT matchinfo(text, 'n') FROM text LIMIT 1; > > Both of these will only work if there is at least one row in the external > content table (as otherwise the query will return zero rows). If your > external content table is sometimes empty, you'll have to figure out a > full-text query that always returns at least one row to use. > > Another way to go, if you're a risk-taking sort, is to read the same value > directly from the fts5 shadow tables. The query: > > SELECT block FROM test_data WHERE id=1; > > returns a blob that consists of a series of SQLite varints. The first of > those varints is the number of rows in the index. Again, assuming your fts5 > table is named "test". > > This isn't actually a public interface, so you might get burned if it changes > at some point. I think that's pretty unlikely, but no promises! > > On consideration, I'd probably go for the direct query on the test_data > table. Not least because if you use the xRowCount() API from a non-full-text > query you will need this fix, which won't be released until 3.31.0: > > https://sqlite.org/src/info/b528bdcd45db1b78 > > But have a test in place to ensure it doesn't break when you upgrade SQLite. > And if you can, build SQLite directly into the app (almost always a good > idea), don't use the system version. > > Good luck, > > Dan. > > > > > >> >> Thanks! >> >> Ben >> ___ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table
On 19/10/62 06:31, Ben Asher wrote: Hello! I'm trying to write some code to keep an external content table in sync with the index. To do this, I need to be able to get some state about the index: either how many rows have been inserted so far or the max rowid that has been inserted into the index. However, if I try to run queries (where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*) FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being an answer as if I had run those queries on the external content table. Is there some other way I can query the state in this case? You can, but it's not quite as easy as it could be. Assuming you're not using the "columnsize=0" option, the xRowCount() API, which is only available from within an auxiliary function returns the value you want: https://sqlite.org/fts5.html#xRowCount The xRowCount() will just fall back to the "SELECT count(*)..." method, which will report the number of rows in the external content table, not the number of rows that have been added to the index. So if you're prepared to write an auxiliary fts5 function in C that invokes the xRowCount() API and returns its value, you could do something like: SELECT row_count(text) FROM text LIMIT 1; Assuming your fts5 table is named "text". Or, if you include the "matchinfo" demo code in your app from here: https://sqlite.org/src/artifact/08c11ec968148d4c in your build: SELECT matchinfo(text, 'n') FROM text LIMIT 1; Both of these will only work if there is at least one row in the external content table (as otherwise the query will return zero rows). If your external content table is sometimes empty, you'll have to figure out a full-text query that always returns at least one row to use. Another way to go, if you're a risk-taking sort, is to read the same value directly from the fts5 shadow tables. The query: SELECT block FROM test_data WHERE id=1; returns a blob that consists of a series of SQLite varints. The first of those varints is the number of rows in the index. Again, assuming your fts5 table is named "test". This isn't actually a public interface, so you might get burned if it changes at some point. I think that's pretty unlikely, but no promises! On consideration, I'd probably go for the direct query on the test_data table. Not least because if you use the xRowCount() API from a non-full-text query you will need this fix, which won't be released until 3.31.0: https://sqlite.org/src/info/b528bdcd45db1b78 But have a test in place to ensure it doesn't break when you upgrade SQLite. And if you can, build SQLite directly into the app (almost always a good idea), don't use the system version. Good luck, Dan. Thanks! Ben ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table
I should have included this in the first place. Here are the queries used to setup this table and an example: CREATE TABLE text (text TEXT NONNULL); CREATE VIRTUAL TABLE text_fts_index USING fts5(text, content=text, content_rowid=rowid, prefix='2 3 4 5 6 7 8 9 10', tokenize='unicode61'); INSERT INTO text VALUES ('hello'); -- I was hoping the following would return 0, but it returns 1, presumably because it's getting the answer from the external content table SELECT COUNT(content_rowid) FROM text_fts_index; Thanks again! Ben On Fri, Oct 18, 2019 at 4:31 PM Ben Asher wrote: > Hello! I'm trying to write some code to keep an external content table in > sync with the index. To do this, I need to be able to get some state about > the index: either how many rows have been inserted so far or the max rowid > that has been inserted into the index. However, if I try to run queries > (where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*) > FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being > an answer as if I had run those queries on the external content table. Is > there some other way I can query the state in this case? > > Thanks! > > Ben > -- Ben ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Count Rows Inserted into FTS5 Index w/ External Content Table
Hello! I'm trying to write some code to keep an external content table in sync with the index. To do this, I need to be able to get some state about the index: either how many rows have been inserted so far or the max rowid that has been inserted into the index. However, if I try to run queries (where "fts" is the name of the FTS5 virtual table) like "SELECT COUNT(*) FROM fts" or "SELECT MAX(rowid) FROM fts", the result always ends up being an answer as if I had run those queries on the external content table. Is there some other way I can query the state in this case? Thanks! Ben ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users