Rense,

>As for the ranges of n1 and n1: they are both roughly between 60000
>and 12000000 .
>
>Here are the results of  EXPLAIN QUERY PLAN SELECT n1, n2 FROM table1
>Where n1 < n2 INTERSECT SELECT n2, n1 FROM table1 Where n2 < n1;
>
>1|0|0|SCAN TABLE table1 (~437976176 rows)
>2|0|0|SCAN TABLE table1 (~437976176 rows)
>0|0|0|COMPOUND SUBQUERIES 1 AND 2 USING TEMP B-TREE (INTERSECT)

If you can change your schema at all and your application supports 
taking a bit more time for inserts, I believe you can achieve much 
better query times.

I tried the following schema:
CREATE TABLE "T" (
   "flags" INTEGER,
   "n1" INTEGER DEFAULT (random() % 100),
   "n2" INTEGER DEFAULT (random() % 100),
   CONSTRAINT "T_order" CHECK(((n1 <= n2) and (flags between 1 and 3))));

>In my database, the mirrored pairs vastly outnumber the non-mirrored
>ones, to the extent that the non-mirrored pairs are actually extremely
>rare (but relevant).

Given the fact that mirrored pairs dominate, if you store new pairs
  (N1, N2)
as
  (flags, n1, n2)
with n1 <= n2 and flags bit 0 set if (n1, n2) = (N1, N2)
               and flags bit 1 set if (n1, n2) = (N2, N1)
your query becomes:
   select * from T where flags = 3;

If you build an index on flags, e.g.
   CREATE INDEX "ixFlags" ON "T" ("flags");
then it will be used (but it is regularly pointed out that in cases 
where the distribution of values queried doesn't fall in small bins, an 
index may in fact slow things down).

Since your have mostly mirrored pairs, using the schema above will 
decrease the size of your DB (or keep it about the same if you use the 
index) but and --I see that as a big bonus-- essentially half the 
number of rows.

Note that, if it can be of any value to your application, you can even 
have the mirrored pairs sorted if you make the index
   CREATE INDEX "ixFlags" ON "T" ("flags", "n1");
Of course this compound index will take some more room, but depending 
on your needs it may prove useful.

The price to pay is having to query first before insert (or have a 
[slow] trigger do the work for you).  Could your application support that?

Sounds to me it could be worth trying on a reduced DB.



--
<mailto:j...@q-e-d.org>j...@antichoc.net  

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to