I've requested this before without response, but I'm asking again because it just caused me pain again: could we get a TODO added to have the planner recognize equivalent IN and EXISTS constructs and have them compete on cost estimates? I know it's not a trivial improvement, but if it's on the list maybe someone will pick it up, and I see it as the single biggest weakness in PostgreSQL performance. I don't need help resolving this particular case, because the fix is always blinding obvious when we hit this, and it doesn't even break portability because no other database we've tested fails to recognize these equivalent cases. step=# explain DELETE FROM "Body" WHERE "bodySeqNo" NOT IN (SELECT "bodySeqNo" FROM "Message"); QUERY PLAN ---------------------------------------------------------------------------------- Seq Scan on "Body" (cost=90277.43..285235351699.39 rows=3313379 width=6) Filter: (NOT (subplan)) SubPlan -> Materialize (cost=90277.43..159793.40 rows=6627957 width=11) -> Seq Scan on "Message" (cost=0.00..80413.07 rows=6627957 width=11) (5 rows)
step=# explain DELETE FROM "Body" WHERE NOT EXISTS (SELECT * FROM "Message" m WHERE m."bodySeqNo" = "Body"."bodySeqNo"); QUERY PLAN -------------------------------------------------------------------------------------------- Seq Scan on "Body" (cost=0.00..3401760.88 rows=3313416 width=6) Filter: (NOT (subplan)) SubPlan -> Index Scan using "Message_Body" on "Message" m (cost=0.00..0.49 rows=1 width=136) Index Cond: (("bodySeqNo")::numeric = ($0)::numeric) (5 rows) The bodySeqNo column is NOT NULL in both tables, and is the primary key in the Body table. The Message table has a non-unique index on it. (\d lists will follow at the bottom.) I cancelled the first query after it had been running for 54 hours over our slowest hours (the weekend). The second form ran in four minutes in competition with peak time queries. -Kevin step=# \d "Body" Table "public.Body" Column | Type | Modifiers -------------+------------------------+----------- bodySeqNo | "SequenceT" | not null contentType | character varying(255) | not null encoding | character varying(255) | body | "BodyT" | Indexes: "Body_pkey" PRIMARY KEY, btree ("bodySeqNo") step=# \d "Message" Table "public.Message" Column | Type | Modifiers -----------------+--------------------------+----------- messageId | "SequenceT" | not null clientMessageId | "ClientMessageIdT" | not null correlationId | "SequenceT" | destQueue | "QueueNameT" | not null replyToQueue | "QueueNameT" | not null typeCode | character(2) | expiration | timestamp with time zone | priority | smallint | not null status | character(2) | not null created | timestamp with time zone | not null lastModified | timestamp with time zone | not null bodySeqNo | "SequenceT" | not null messageIdSearch | "PrioritySequenceT" | not null Indexes: "Message_pkey" PRIMARY KEY, btree ("messageId") "MessageIndex2" UNIQUE, btree ("destQueue", "clientMessageId") "Message_MessageIdSearch" UNIQUE, btree ("destQueue", status, "messageIdSearch") CLUSTER "Message_Body" btree ("bodySeqNo") "Message_Created" btree ("destQueue", status, created) "Message_Created2" btree ("destQueue", created) "Message_Expiration" btree (expiration) "Message_LastModified" btree ("destQueue", "lastModified") "Message_ReplyToQueue" btree ("replyToQueue") Foreign-key constraints: "Message_fk1" FOREIGN KEY ("destQueue") REFERENCES "Queue"(name) "Message_fk2" FOREIGN KEY ("replyToQueue") REFERENCES "Queue"(name) ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match