Josh,

1) To determine your query order a la Dan Tow and drive off of person, please
SET JOIN_COLLAPSE_LIMIT = 1 and then run Mark Kirkwood's version of the
query. (Not that I believe in Dan Tow ... see previous message ... but it
would be interesting to see the results.)
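
For reference, here is roughly how I ran it; the query itself is Mark
Kirkwood's version, unchanged from my earlier message, so I have elided it:

    SET join_collapse_limit = 1;
    EXPLAIN ANALYZE SELECT ... ;  -- Mark's version of the query goes here
    RESET join_collapse_limit;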

Unfortunately, the query still takes 250 milliseconds. I tried it with other queries and the results are the same as before. Here is the EXPLAIN ANALYZE output anyway:


Sort  (cost=2036.83..2036.87 rows=16 width=103) (actual time=328.000..328.000 rows=39 loops=1)
  Sort Key: cs.date
  ->  Nested Loop Left Join  (cost=620.61..2036.51 rows=16 width=103) (actual time=250.000..328.000 rows=39 loops=1)
        ->  Hash Join  (cost=620.61..1984.90 rows=16 width=78) (actual time=250.000..328.000 rows=39 loops=1)
              Hash Cond: ("outer".carrier_code_id = "inner".id)
              ->  Merge Join  (cost=606.11..1965.99 rows=825 width=74) (actual time=250.000..328.000 rows=310 loops=1)
                    Merge Cond: ("outer".current_status_id = "inner".id)
                    ->  Index Scan using shipment_current_status_id_idx on shipment s  (cost=0.00..2701.26 rows=60307 width=66) (actual time=0.000..77.000 rows=27711 loops=1)
                          Filter: (is_purged = false)
                    ->  Sort  (cost=606.11..610.50 rows=1756 width=12) (actual time=141.000..141.000 rows=6902 loops=1)
                          Sort Key: cs.id
                          ->  Hash Join  (cost=1.11..511.48 rows=1756 width=12) (actual time=0.000..109.000 rows=6902 loops=1)
                                Hash Cond: ("outer".release_code_id = "inner".id)
                                ->  Index Scan Backward using current_status_date_idx on shipment_status cs  (cost=0.01..422.58 rows=14047 width=16) (actual time=0.000..78.000 rows=14925 loops=1)
                                      Index Cond: ((date >= (('now'::text)::date - 31)) AND (date <= ('now'::text)::date))
                                      Filter: (cs.* IS NOT NULL)
                                ->  Hash  (cost=1.10..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
                                      ->  Seq Scan on release_code rc  (cost=0.00..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                            Filter: ((number)::text = '9'::text)
              ->  Hash  (cost=14.49..14.49 rows=2 width=8) (actual time=0.000..0.000 rows=0 loops=1)
                    ->  Nested Loop  (cost=6.87..14.49 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=1)
                          ->  Index Scan using person_pkey on person p  (cost=0.00..5.73 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                Index Cond: (id = 355)
                          ->  Hash Join  (cost=6.87..8.74 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=1)
                                Hash Cond: ("outer".carrier_id = "inner".carrier_id)
                                ->  Seq Scan on carrier_code cc  (cost=0.00..1.57 rows=57 width=8) (actual time=0.000..0.000 rows=57 loops=1)
                                ->  Hash  (cost=6.86..6.86 rows=1 width=12) (actual time=0.000..0.000 rows=0 loops=1)
                                      ->  Hash Join  (cost=3.04..6.86 rows=1 width=12) (actual time=0.000..0.000 rows=1 loops=1)
                                            Hash Cond: ("outer".id = "inner".carrier_id)
                                            ->  Seq Scan on carrier c  (cost=0.00..3.54 rows=54 width=4) (actual time=0.000..0.000 rows=54 loops=1)
                                            ->  Hash  (cost=3.04..3.04 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1)
                                                  ->  Index Scan using carrier_to_person_person_id_idx on carrier_to_person ctp  (cost=0.00..3.04 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=1)
                                                        Index Cond: (355 = person_id)
        ->  Index Scan using shipment_status_shipment_id_idx on shipment_status ss  (cost=0.00..3.20 rows=2 width=25) (actual time=0.000..0.000 rows=1 loops=39)
              Index Cond: (ss.shipment_id = "outer".id)
Total runtime: 328.000 ms


2) Force PG to drop the merge join via SET ENABLE_MERGEJOIN = FALSE;

Setting this option had no effect either. In fact, the query is a bit slower (266 milliseconds, though 250 came up once in 20 executions).
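
For completeness, this is roughly what I ran (again with the query text elided):

    SET enable_mergejoin = false;
    EXPLAIN ANALYZE SELECT ... ;  -- same query as above
    RESET enable_mergejoin;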


Also, please let us know something about the server you're using and your
configuration parameters, particularly:
shared_buffers
work_mem
effective_cache_size
random_page_cost

Well, I'm on a test machine, so the settings haven't changed one bit from the defaults. This may sound embarrassing, but I bet the production server is not custom-configured either. The computer I'm running these queries on is just a simple Athlon XP 2100+ on WinXP with 1 GB of RAM. The production server is a faster P4, but the rest is the same. Here are the four values from my configuration file; three of them were commented out:


shared_buffers = 1000
#work_mem = 1024
#effective_cache_size = 1000
#random_page_cost = 4

I'm not sure what these do, but I'm guessing the last two affect how the planner weighs the statistics. Should I increase the first two?
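
From skimming the docs, my guess at more sensible values for a 1 GB machine
would be something like the following, though I haven't tested any of this
(the units are 8 kB buffers, kB, and 8 kB pages, respectively):

    shared_buffers = 10000           # ~80 MB of buffer cache
    work_mem = 8192                  # 8 MB per sort/hash operation
    effective_cache_size = 65536     # ~512 MB, a rough guess at the OS cache
    random_page_cost = 3             # assuming the data set is mostly cached

Does that look like a sane starting point?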

Regards,
Ken


