>>>>> Jim Morris writes:
>>>>> On 10/6/2011 10:43 PM, Ivan Shmakov wrote:
>>>>> Jim Morris writes:
[…]
>>> INSERT INTO fts3_table (a,b,c)
>>> SELECT 'an A','a B','a C'
>>> WHERE NOT EXISTS
>>> (SELECT DISTINCT a,b,c
>>> FROM fts3_table
>>> WHERE a='an A' AND b='a B' AND c='a C');
>> However, I wonder, would the following (slightly more concise) query
>> imply any performance loss in comparison to the one above?
>> INSERT INTO fts3_table (a, b, c)
>> SELECT 'an A', 'a B', 'a C'
>> EXCEPT SELECT DISTINCT a, b, c
>> FROM fts3_table;
>> Also, I'm curious if DISTINCT may cause any performance loss in the
>> case that the columns in question are constrained by an UNIQUE
>> index? Like:
>> CREATE UNIQUE INDEX "foo-unique"
>> ON "foo" (a, b, c);
> I don't know for sure. You would need to do some testing to
> determine performance issues. I wouldn't use distinct if the values
> are already guaranteed to be unique
As per EXPLAIN, SELECT DISTINCT indeed doesn't take the
uniqueness constraint into consideration.
> or in a not exits clause since it may have additional overhead.
Surprisingly enough, when used in the EXCEPT part of the query,
DISTINCT has no effect on the VM code (as shown by EXPLAIN)!
At a first glance, the code for the NOT EXISTS variant (sans
INSERT; hopefully it doesn't matter) seems simpler (one
operation less in total, one less Next, has no OpenEphemeral's,
and IdxGE instead of IdxInsert and IdxDelete):
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Trace 0 0 0 00
1 Goto 0 21 0 00
2 OpenRead 0 2 1 1 00
3 Rewind 0 19 0 00
4 Integer 0 2 0 00
5 Integer 1 3 0 00
6 OpenRead 2 3 0 keyinfo(1,BINARY) 00
7 Column 0 0 4 00
8 IsNull 4 14 0 00
9 SeekGe 2 14 4 1 00
10 IdxGE 2 14 4 1 01
11 Integer 1 2 0 00
12 IfZero 3 14 -1 00
13 Next 2 10 0 00
14 Close 2 0 0 00
15 If 2 18 1 00
16 Column 0 0 7 00
17 ResultRow 7 1 0 00
18 Next 0 4 0 01
19 Close 0 0 0 00
20 Halt 0 0 0 00
21 Transaction 0 0 0 00
22 VerifyCookie 0 2 0 00
23 Transaction 1 0 0 00
24 VerifyCookie 1 1 0 00
25 TableLock 1 2 0 temp.staged 00
26 TableLock 0 2 0 foo 00
27 Goto 0 2 0 00
The code for the EXCEPT variant is:
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Trace 0 0 0 00
1 OpenEphemeral 2 1 0 keyinfo(1,BINARY) 00
2 Goto 0 22 0 00
3 OpenRead 1 2 1 1 00
4 Rewind 1 9 0 00
5 Column 1 0 1 00
6 MakeRecord 1 1 2 00
7 IdxInsert 2 2 0 00
8 Next 1 5 0 01
9 Close 1 0 0 00
10 OpenRead 0 2 0 2 00
11 Rewind 0 15 0 00
12 Column 0 1 1 00
13 IdxDelete 2 1 1 00
14 Next 0 12 0 01
15 Close 0 0 0 00
16 Rewind 2 20 0 00
17 Column 2 0 3 00
18 ResultRow 3 1 0 00
19 Next 2 17 0 00
20 Close 2 0 0 00
21 Halt 0 0 0 00
22 Transaction 0 0 0 00
23 VerifyCookie 0 2 0 00
24 Transaction 1 0 0 00
25 VerifyCookie 1 1 0 00
26 TableLock 1 2 0 temp.staged 00
27 TableLock 0 2 0 foo 00
28 Goto 0 3 0 00
The SQL code used for testing is as follows.
BEGIN;
CREATE TABLE "foo" (
"id" INTEGER PRIMARY KEY,
"bar" TEXT NOT NULL);
CREATE UNIQUE INDEX "foo-unique"
ON "foo" ("bar");
INSERT INTO "foo" ("bar") VALUES ('one');
INSERT INTO "foo" ("bar") VALUES ('two');
INSERT INTO "foo" ("bar") VALUES ('three');
CREATE TEMP TABLE "temp.staged" ("bar" TEXT NOT NULL);
INSERT INTO "temp.staged" ("bar") VALUES ('one');
INSERT INTO "temp.staged" ("bar") VALUES ('four');
.explain ON
EXPLAIN
SELECT DISTINCT "bar"
FROM "foo";
EXPLAIN
SELECT "bar"
FROM "foo";
-- INSERT INTO "foo" ("bar")
EXPLAIN
SELECT "bar"
FROM "temp.staged"
EXCEPT SELECT DISTINCT "bar"
FROM "foo";
-- produces exactly the same output as the one above:
EXPLAIN
SELECT "bar"
FROM "temp.staged"
EXCEPT SELECT "bar"
FROM "foo";
EXPLAIN
SELECT s."bar"
FROM "temp.staged" s
WHERE NOT EXISTS (SELECT 1
FROM "foo" f
WHERE f."bar" = s."bar");
-- SELECT *
-- FROM "foo";
END;
PS. As of SQLite version 3.7.3 debian 1.
--
FSF associate member #7257
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users