Hello all,
I'm trying to speed up some insert statements. I have been tinkering
with the postmaster and DBI parameters I did some timings on my insert
and copy commands. Here is a sample insert query:
010430.18:31:18.199 [2604] query: insert into log values
(0,0,lower('blah.blah.mydomain.com'),lower('foo'),lower('bar'),lower('blah'),upper('Me'),
upper('Myself'), upper('I'), upper('INFO'), 'String Here', '20010430
16:00:00')
Pretty straightforward. Table log looks like:
Table "log"
Attribute | Type| Modifier
--+---+--
site_id | bigint|
host_id | bigint|
fqdn | varchar() | not null
site | varchar() | not null
region | varchar() | not null
hostname | varchar() | not null
product | varchar() | not null
class| varchar() | not null
subclass | varchar() | not null
status | varchar() | not null
msg | varchar() | not null
remote_stamp | timestamp | not null
tstamp | timestamp | not null
Here are my non-scientific timings:
with AutoCommit on, using DBI across TCP/IP: 1.3 INSERTS/second
with AutoCommit off, DBI, TCP/IP, committing after every 100: 1.6
INSERTS/second
using psql -h host -U user -c "copy log from stdin" dbname < datafile
1.73 rows/second
using COPY LOG FROM 'filename' on the db machine itself: 1.73
rows/second
Another crucial piece of information is that each insert kicks off a
trigger. I did not write the trigger, and do not know how to write
triggers, but I think that might be the contributing factor to the
slowness. Here is the text file used to create the trigger:
drop function update_host_table();
drop trigger incoming_trigger on incoming ;
create function update_host_table()
returns opaque
as 'declare
myrec record;
new_hostid int4;
begin
new.timestamp := now() ;
/* check to see if we have see this machine before */
select * into myrec
from knownhosts k
where k.fqdn = new.fqdn and
k.hostname = new.hostname ;
/* -- if we have not found the machine name we are going to
insert a new record into the knownhosts table and set the init_contact
to now
*/
if not found
then
insert into knownhosts
values (new.fqdn,new.hostname,new.timestamp,new.timestamp) ;
else
update knownhosts
set last_contact = new.timestamp
where knownhosts.fqdn = new.fqdn ;
end if ;
/* now we are going to update the status table with the new record */
select * into myrec
from status s where
s.fqdn = new.fqdn and s.hostname=new.hostname
and s.class=new.class and s.sub_class=new.sub_class ;
if not found
then
insert into status
values (new.fqdn,new.hostname,new.class,
new.sub_class,new.level,new.msg,new.timestamp) ;
else
update status
set level = new.level,
timestamp = new.timestamp
where fqdn=new.fqdn and hostname=new.hostname and
class = new.class and sub_class = new.sub_class ;
end if;
return new;
end ;'
language 'plpgsql';
create trigger incoming_trigger
before insert on incoming
for each row
execute procedure update_host_table();
1.73 INSERTS/second seems awfully slow, but maybe I have set my
expectations too high. Now that you all can see the table and the kind
of data I am trying to put into it, do you have any suggestions? The
hardware specs of the database machine are: Pentium III 733Mhz, 512
megs memory, 7 gigs free on the partition. Seems like I should be
getting a lot more horsepower. I really need to speed this up somehow.
Does anyone see anything in the trigger or otherwise that would cause
this to be so slow?
Thank you very much,
Fran
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster