If this is too off topic, please let me know and I'll sign up on a Postgres list to get help. But this is related to my use of PostGIS, and if anyone knows this stuff, it's you guys.
I have an example query that I would expect to be much faster, but my main concern is that we are about to do some visualization of historical congestion data, which will require queries across much larger data sets -- on the order of 150 million records a day. We are about to test partitioning, but the row count per table will still be much larger than what I am dealing with now.

So here is a query that I would expect to run much faster than the 43 seconds it takes, joining two tables, one with about 97k rows and the other with 3.2 million:

explain select count(l.*) from links l, source_link ld where l.link_id = ld.link_id;

Aggregate  (cost=174731.72..174731.73 rows=1 width=32)
  ->  Hash Join  (cost=13024.27..166784.14 rows=3179029 width=32)
        Hash Cond: (ld.link_id = l.link_id)
        ->  Seq Scan on source_link ld  (cost=0.00..58282.29 rows=3179029 width=10)
        ->  Hash  (cost=10963.12..10963.12 rows=96812 width=42)
              ->  Seq Scan on links l  (cost=0.00..10963.12 rows=96812 width=42)

Each table has an index on link_id, defined like this:

CREATE INDEX links_link_id_idx ON links USING btree (link_id);
CREATE INDEX source_link_link_id_idx ON source_link USING btree (link_id);

Shouldn't these indexes prevent the sequential scans, or am I misreading the plan? Should this really take 43 seconds?

Thanks for any advice,
charles
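P.S. In case it helps, here is a minimal sketch of what I could run to compare actual timings rather than planner estimates (assuming the statistics may be stale; ANALYZE refreshes them, EXPLAIN ANALYZE actually executes the query, and enable_seqscan is only turned off for the session to see what the index plan costs):

-- Refresh planner statistics so the row estimates are current
ANALYZE links;
ANALYZE source_link;

-- EXPLAIN ANALYZE runs the query and reports real row counts and timings
EXPLAIN ANALYZE
SELECT count(*)   -- count(*) instead of count(l.*): no need to assemble whole row values
FROM links l
JOIN source_link ld ON ld.link_id = l.link_id;

-- Temporarily discourage sequential scans to see whether an index plan is actually faster
SET enable_seqscan TO off;
EXPLAIN ANALYZE
SELECT count(*)
FROM links l
JOIN source_link ld ON ld.link_id = l.link_id;
RESET enable_seqscan;

If the forced index plan comes out slower, I suppose the hash join with sequential scans is the right plan for a join that has to touch every row anyway, and the 43 seconds is coming from somewhere else.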