I am in process of migrating a database from Pg 7.4.3 to Pg 8.0.3. I
dumped and reloaded the data, and things look good, until I try to
work with it. When I try to make a query to a view based on a large
join (select * from 'view' limit 10), the 7.4.3 query works very
well, and the indexes are being used well. But, when I go to 8.0.3,
there are some very costly joins, and the query is not workable.
(I've put the 'EXPLAIN' results below for the same database, and the
same query.) Also, the memory settings have been made the same for
7.4 vs. 8.0.
I should point out that some of the indexes (and joins) are based on
the type 'citext'. (http://gborg.postgresql.org/project/citext/
projdisplay.php). I did compile and add this to my Pg 8
installation, but I am concerned that this might be the root
problem. There seems to be no update on this work since sometime in
2003, and it might not be appropriate to use in 8.0.3. But, it seems
that indexes are being used when I do searches on the individual
tables. So, I'm a bit stumped at the moment.
So, I'd like to ask advice on my query, and what might be the cause.
If 'citext' is the culprit, how can I migrate a very large database
which uses this datatype extensively.
Any help is greatly appreciated.
Thanks,
-albert
===
In Pg 8.0.3--
EXPLAIN select * from snp_quality_overview limit 10;
QUERY PLAN
-
Limit (cost=83734.45..84167.07 rows=10 width=415)
-> Hash Join (cost=83734.45..10949200.55 rows=251154 width=415)
Hash Cond: ("outer".panel_id = "inner".panel_id)
-> Nested Loop (cost=83733.39..10924084.09 rows=251154
width=359)
-> Nested Loop (cost=83733.39..5685331.85
rows=253240 width=324)
-> Hash Join (cost=83733.39..3840930.37
rows=423746 width=291)
Hash Cond: ("outer".assay_id =
"inner".assay_id)
-> Nested Loop (cost=862.47..3680521.72
rows=602829 width=259)
-> Hash Join
(cost=862.47..258437.41 rows=602829 width=195)
Hash Cond:
("outer".genotype_submission_id = "inner".genotype_submission_id)
-> Seq Scan on genotype_set
gs (cost=0.00..218413.11 rows=6626211 width=117)
-> Hash
(cost=861.68..861.68 rows=316 width=94)
-> Nested Loop
(cost=450.77..861.68 rows=316 width=94)
Join Filter:
("inner".lsid = "outer".prot_lsid)
-> Merge Join
(cost=446.02..671.68 rows=358 width=83)
Merge Cond:
("outer".batch_file_id = "inner".batch_file_id)
-> Index
Scan using batch_file_id3 on genotype_submission gsb
(cost=0.00..210.81 rows=3472 width=75)
-> Sort
(cost=446.02..447.31 rows=517 width=20)
Sort
Key: batch_file.batch_file_id
->
Merge Join (cost=23.66..422.72 rows=517 width=20)
Merge Cond: ("outer".monthly_batch_id = "inner".monthly_batch_id)
Join Filter: ("outer".center_name = "inner".center_name)
-> Index Scan using monthly_batch_id on batch_file
(cost=0.00..298.59 rows=5024 width=24)
-> Sort (cost=23.66..24.02 rows=145 width=19)
Sort Key: monthly_batch.monthly_batch_id
-> Seq Scan on monthly_batch (cost=0.00..18.45 rows=145 width=19)
-> Materialize
(cost=4.74..4.97 rows=23 width=72)
-> Hash
Join (cost=1.10..4.72 rows=23 width=72)
Hash
Cond: ("outer".genotyping_platform_id = "inner".genotyping_platform_id)
->
Seq Scan on protocol p (cost=0.00..3.26 rows=26 width=65)
->
Hash (cost=1.08..1.08 rows=8