On Tue, 2013-02-05 at 22:59 +0000, Simon Slavin wrote:

> This might be faster if you have an index for TextNGram(text, size,
> ngram).

Thanks, I'll definitely try that!

> >         (SELECT tn.ngram
> >            FROM TextNGram tn, Text t
> >            WHERE t.label != Text.label
> >               AND t.label != ''
> >               AND tn.text = t.id
> >               AND tn.ngram = TextNGram.ngram
> Is the last line not a test to see that "Text.label != Text.label" and
> "TextNGram.ngram = TextNGram.ngram" ?

I'm not sure what you mean here. The subselect should find any
TextNGram.ngram that does not share the label of a matching row in the
parent select, but which does share the actual ngram.

Are you suggesting that the test is redundant, or incorrect, or
something else? From my tests it appears to be correct, at least.

> The whole thing might work faster if you first load a representative
> dataset into all tables, then run the ANALYZE command.

As part of the database creation step of the program that uses the
database, I run ANALYZE after adding the data and indices.

Jamie

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to