Hello, I am developing a small app for determining the available combinations of reagents stored in a sqlite database where no two reagents have the same conjugate. I have tried two approaches to this query, one has very good performance and the other is very poor. I am trying to understand why the second has such bad performance. Both queries return the correct answer.
The first option creates temp tables populated from select statements, then a cross join between all of the temp tables with a big where clause to get just the rows where no conjugate appears more than once. This query executes in < 2 seconds. CREATE TEMP TABLE 'CD4' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD4'; CREATE TEMP TABLE 'CD19' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD19'; CREATE TEMP TABLE 'CD20' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD20'; CREATE TEMP TABLE 'CD21' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD21'; CREATE TEMP TABLE 'CD22' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD22'; CREATE TEMP TABLE 'CD23' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD23'; CREATE TEMP TABLE 'CD24' AS SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD24'; SELECT count(*) FROM 'CD4', 'CD19', 'CD20', 'CD21', 'CD22', 'CD23', 'CD24' WHERE 'CD4'.Conjugate <> 'CD19'.Conjugate AND 'CD4'.Conjugate <> 'CD20'.Conjugate AND 'CD4'.Conjugate <> 'CD21'.Conjugate AND 'CD4'.Conjugate <> 'CD22'.Conjugate AND 'CD4'.Conjugate <> 'CD23'.Conjugate AND 'CD4'.Conjugate <> 'CD24'.Conjugate AND 'CD19'.Conjugate <> 'CD20'.Conjugate AND 'CD19'.Conjugate <> 'CD21'.Conjugate AND 'CD19'.Conjugate <> 'CD22'.Conjugate AND 'CD19'.Conjugate <> 'CD23'.Conjugate AND 'CD19'.Conjugate <> 'CD24'.Conjugate AND 'CD20'.Conjugate <> 'CD21'.Conjugate AND 'CD20'.Conjugate <> 'CD22'.Conjugate AND 'CD20'.Conjugate <> 'CD23'.Conjugate AND 'CD20'.Conjugate <> 'CD24'.Conjugate AND 'CD21'.Conjugate <> 'CD22'.Conjugate AND 'CD21'.Conjugate <> 'CD23'.Conjugate AND 'CD21'.Conjugate <> 'CD24'.Conjugate AND 'CD22'.Conjugate <> 'CD23'.Conjugate AND 'CD22'.Conjugate <> 'CD24'.Conjugate AND 'CD23'.Conjugate <> 'CD24'.Conjugate ORDER BY 'CD4'.Conjugate,'CD19'.Conjugate, 'CD20'.Conjugate, 'CD21'.Conjugate, 'CD22'.Conjugate, 'CD23'.Conjugate, 'CD24'.Conjugate; DROP TABLE 'CD4'; DROP TABLE 'CD19'; DROP TABLE 'CD20'; DROP TABLE 'CD21'; DROP TABLE 'CD22'; DROP TABLE 'CD23'; DROP TABLE 'CD24'; The second approach uses nested select statements rather than temp tables. The logic is the same as the query above, only that the temp tables have been replaced by the equivalent select statements. The performance of this query is horrible, executing in > 3 minutes. SELECT count(*) FROM (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD4') AS 'CD4', (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD19') AS 'CD19', (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD20') AS 'CD20', (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD21') AS 'CD21', (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD22') AS 'CD22', (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD23') AS 'CD23', (SELECT Conjugate FROM Catalog WHERE Species IN ('Human') AND Conjugate NOT IN ('UNLB', 'Purified') AND ProductName='CD24') AS 'CD24' WHERE 'CD4'.Conjugate <> 'CD19'.Conjugate AND 'CD4'.Conjugate <> 'CD20'.Conjugate AND 'CD4'.Conjugate <> 'CD21'.Conjugate AND 'CD4'.Conjugate <> 'CD22'.Conjugate AND 'CD4'.Conjugate <> 'CD23'.Conjugate AND 'CD4'.Conjugate <> 'CD24'.Conjugate AND 'CD19'.Conjugate <> 'CD20'.Conjugate AND 'CD19'.Conjugate <> 'CD21'.Conjugate AND 'CD19'.Conjugate <> 'CD22'.Conjugate AND 'CD19'.Conjugate <> 'CD23'.Conjugate AND 'CD19'.Conjugate <> 'CD24'.Conjugate AND 'CD20'.Conjugate <> 'CD21'.Conjugate AND 'CD20'.Conjugate <> 'CD22'.Conjugate AND 'CD20'.Conjugate <> 'CD23'.Conjugate AND 'CD20'.Conjugate <> 'CD24'.Conjugate AND 'CD21'.Conjugate <> 'CD22'.Conjugate AND 'CD21'.Conjugate <> 'CD23'.Conjugate AND 'CD21'.Conjugate <> 'CD24'.Conjugate AND 'CD22'.Conjugate <> 'CD23'.Conjugate AND 'CD22'.Conjugate <> 'CD24'.Conjugate AND 'CD23'.Conjugate <> 'CD24'.Conjugate ORDER BY 'CD4'.Conjugate, 'CD19'.Conjugate, 'CD20'.Conjugate, 'CD21'.Conjugate, 'CD22'.Conjugate, 'CD23'.Conjugate, 'CD24'.Conjugate; I've observed these execution times using the sqlite command line utility. Why is this? Is my index on the catalog table not being used for some reason? I would like to avoid creating temp tables in the DB because of issues with the Qt SQLite driver I'm using which keeps open connections to the DB and locks these temp tables during some operations. Below is my database schema: DROP TABLE IF EXISTS catalog; CREATE TABLE catalog( PartNumber TEXT PRIMARY KEY, ProductName TEXT, Species TEXT, Conjugate TEXT, Allele TEXT, Format TEXT, Size TEXT, Application TEXT, ProductLine TEXT, Status TEXT, PageNumber TEXT, LotNumber TEXT, ExpirationDate TEXT, UserDefined INTEGER); CREATE INDEX idx_cat ON catalog(Conjugate, PartNumber); Thank You! -Bryan [EMAIL PROTECTED] _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users