Sometimes using a temp table is a better idea:
e.g.
-- start by creating a temp table 'tids' that hold the to_ids that -- we are interested in. SELECT to_id INTO TEMP TABLE tids FROM correlation WHERE from_id = 1234 ORDER BY val DESC limit 100;
-- The following temp table makes use of the primary key on -- the correlation table, and the stated goal from the original -- question that: -- from_id > to_id -- and from_id in (tids.to_id) -- and to_id in (tids.to_id)
SELECT t1.to_id AS from_id, t2.to_id INTO TEMP TABLE from_to FROM tids t1, tids t2 WHERE t1.to_id > t2.to_id;
-- Now we can use the from_to table as an index into the correlation -- table.
SELECT c.from_id, c.to_id, c.val FROM from_to JOIN correlation c USING(from_id, to_id) WHERE val > 0.5;
The explain analyze for the final select works out to:
Nested Loop (cost=0.00..50692.00 rows=8488 width=16) (actual time=0.171..150.095 rows=2427 loops=1)
-> Seq Scan on from_to (cost=0.00..79.38 rows=5238 width=8) (actual time=0.006..7.660 rows=4950 loops=1)
-> Index Scan using correlation_pkey on correlation c (cost=0.00..9.63 rows=2 width=16) (actual time=0.024..0.025 rows=0 loops=4950)
Index Cond: (("outer".from_id = c.from_id) AND ("outer".to_id = c.to_id))
Filter: (val > 0.5::double precision)
Total runtime: 152.261 ms
Richard Huxton wrote:
Sean Davis wrote:
I answer my own question, if only for my own records. The following query is about 5-6 times faster than the original. Of course, if anyone else has other ideas, I'd be happy to hear them.
Sean
explain analyze select from_id,to_id,val from exprsdb.correlation where from_id in (select to_id from exprsdb.correlation where from_id=2424 order by val desc limit 100) and to_id in (select to_id from exprsdb.correlation where from_id=2424 order by val desc limit 100) and val>0.6 and to_id<from_id;
Might not be any faster, but you can do this as a self-join with subquery:
SELECT c1.from_id, c1.to_id, c1.val FROM correlation c1, ( SELECT to_id FROM correlation WHERE from_id=2424 ORDER BY val DESC LIMIT 100 ) AS c2 ( SELECT to_id FROM correlation WHERE from_id=2424 ORDER BY val DESC LIMIT 100 ) AS c3 WHERE c1.from_id = c2.to_id AND c1.to_id = c3.to_id AND c1.val > 0.5 AND c1.to_id < from_id ;
I think PG should be smart enough nowadays to figure out these two queries are basically the same.
-- Edmund Bacon <[EMAIL PROTECTED]>
---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly