-----Original Message----- 
From: Max Vlasov
Sent: Thursday, June 23, 2011 5:09 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Query with UNION on large table

On Wed, Jun 22, 2011 at 9:25 PM, Rense Corten <rcor...@gmail.com> wrote:

>
> CREATE TABLE table2 AS SELECT n1,n2 FROM (SELECT n1, n2 FROM table1
> UNION SELECT n2 AS n1, n1 AS n2 FROM table1) WHERE(n1<n2) GROUP BY
> n1,n2;
>
> This has the desired result on a small example, but when I try this on
> my actual table which has about 800 million rows, the query never
> seems to complete. It has been running for a couple of days now, and
> it doesn't seem sqlite is still doing anything (cpu usage dropped to
> almost zero)...
>

I noticed that fewer reads/writes will be with the following variant

SELECT * FROM (SELECT n1, n2 FROM table1 Where n1 < n2)
UNION
SELECT * FROM (SELECT n2, n1 FROM table1 Where n2 < n1)

since this one decreases the sizes of the tables that should be ordered.

Max Vlasov
_______________________________________________

Both the original and Max's streamlined query produce the wrong results for 
what is intended. Revising Max's produces the most efficient one submitted 
so far:

SELECT n1, n2 FROM table1 Where n1 < n2
INTERSECT
SELECT n2, n1 FROM table1 Where n2 < n1;

That's essentially the same as what I had earlier proposed but with the 
added condition on the first SELECT that reduces the result set in the 
temporary B-Tree (I'm getting deeper than I'm comfortable with now but 
that's what I surmise from QUERY PLAN...).

My observations are on two datasets of 100,000 records - one that has but 
one mirrored pairings in {0-65535}, the other with almost 45,000 mirrored 
pairings (45 distinct) in {0-9}. Execution is faster on the second set by a 
factor of almost 3.

Tom 

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to