Jim C. Nasby wrote:
On Sun, Jul 31, 2005 at 08:51:06AM -0800, Matthew Schumacher wrote:
Ok, here is the current plan.
Change the spamassassin API to pass a hash of tokens into the storage
module, pass the tokens to the proc as an array, start a transaction,
load the tokens into a temp table using copy, select the tokens distinct
into the token table for new tokens, update the token table for known
tokens, then commit.
You might consider:
UPDATE tokens
FROM temp_table (this updates existing records)
INSERT INTO tokens
SELECT ...
FROM temp_table
WHERE NOT IN (SELECT ... FROM tokens)
This way you don't do an update to newly inserted tokens, which helps
keep vacuuming needs in check.
The subselect might be quite a big set, so avoiding a full table scan
and materialization by
DELETE temp_table
WHERE key IN (select key FROM tokens JOIN temp_table);
INSERT INTO TOKENS SELECT * FROM temp_table;
or
INSERT INTO TOKENS
SELECT temp_table.* FROM temp_table LEFT JOIN tokens USING (key)
WHERE tokens.key IS NULL
might be an additional win, assuming that only a small fraction of
tokens is inserted and updated.
Regards,
Andreas
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend