Philip Bennefall wrote:
> I hate to be cluttering up the list in this fashion, but I have come across
> an issue that I cannot seem to find a solution for.
>
> I am using two fts tables, one that uses the normal tokenizer and another
> that uses the porter stemmer, so that I can search the same dataset with and
> without porter. For the porter stemmer table, I have set the content option
> to point to the other fts table. Like this:
>
> CREATE VIRTUAL TABLE main_brain USING fts4(phrase, response, matchinfo=fts3,
> tokenize=simple, order=desc);
>
> CREATE VIRTUAL TABLE porter_brain USING fts4(content="main_brain", phrase,
> response, matchinfo=fts3, tokenize=porter, order=desc);
>
> Then I do the following:
>
> INSERT INTO main_brain(phrase, response) VALUES('hello', 'Hi there!');
>
> So you'd now think that main_brain should have this content in it, but the
> porter_brain table should be empty. At least, that is what the documentation
> on fts4 seems to indicate as it points out that it is my own responsibility
> to make sure the tables are in sync. But:
>
> SELECT * FROM porter_brain;
>
> Produces:
>
> hello|Hi there!
As porter_brain takes all content from main_brain, full scan result looks
"correct" ("SELECT FROM porter_brain" internally replaced with "SELECT FROM
main_brain"). But before you execute INSERT below, all fts indexes are missing,
so any request that utilize them, like
SELECT * FROM porter_brain WHERE phrase MATCH 'hello';
will fail.
> And:
>
> SELECT * FROM main_brain;
>
> Gives the same result. So it seems as though some internal synchronization
> between these tables is going on. How should I handle this? The documentation
> suggests a statement like the following in one of its examples:
>
> INSERT INTO t3(docid, b, c) SELECT id, b, c FROM t2;
> Where I guess t3 would correspond to porter_brain and t2 would be main_brain
> in my case. But I don't seem to need to do this at all.
INSERT INTO porter_brain (docid, phrase, response)
SELECT docid, phrase, response FROM main_brain;
Before you execute this insert statement, porter_brain state is *inconsistent*
and it cannot be used properly.
[FWIW, most natural place for this insert would be AFTER INSERT trigger ... but
as main_brain is VIRTUAL TABLE, it is impossible; you can consider using
external content table for both virtual tables with trigger:
CREATE TABLE tblContent(phrase TEXT, response TEXT);
CREATE VIRTUAL TABLE main_brain USING fts4(content="tblContent", phrase,
response, matchinfo=fts3, tokenize=simple, order=desc);
CREATE VIRTUAL TABLE porter_brain USING fts4(content="tblContent", phrase,
response, matchinfo=fts3, tokenize=porter, order=desc);
CREATE TRIGGER trgSync AFTER INSERT ON tblContent FOR EACH ROW
BEGIN
INSERT INTO main_brain (docid, phrase, response)
VALUES (NEW.rowid, NEW.phrase, NEW.response);
INSERT INTO porter_brain (docid, phrase, response)
VALUES (NEW.rowid, NEW.phrase, NEW.response);
END;
].
> Can anyone throw some light on this?
>
> Thanks in advance.
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users