Hi,

I've got a number of files containing generic log data, and some of the lines may or may not be duplicated across files. I'm feeding these into a database using Perl DBI and just ignoring any duplicate record errors. This is fine for day-to-day running, when the data feeds in at a sensible rate; however, if I want to feed in a load of old data in a short space of time, this approach simply isn't quick enough.
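For reference, the feeder currently does something like the sketch below for every log line. This is simplified: the connection details, the tab-delimited input format and the three-column table are stand-ins for the real thing, which has loads more columns.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Duplicate-key errors are deliberately not fatal; we just skip the row.
    my $dbh = DBI->connect('dbi:Pg:dbname=logs', 'user', 'password',
                           { RaiseError => 0, PrintError => 0, AutoCommit => 1 });

    my $sth = $dbh->prepare(
        'INSERT INTO messages (host, messageid, body) VALUES (?, ?, ?)');

    while (my $line = <>) {
        chomp $line;
        my ($host, $messageid, $body) = split /\t/, $line, 3;
        # If the row already exists, execute() fails and we carry on.
        $sth->execute($host, $messageid, $body);
    }

With AutoCommit on, that's one round trip and one transaction per log line, which is why back-filling a large backlog takes so long.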
I can modify the feeder script to generate formatted CSV files that I can then COPY into a temporary table in the database. However, I'd then need to select each record from the temporary table and insert it into the main table, omitting duplicates. I guess I'd need something like this:

    INSERT INTO messages (host, messageid, body, and, loads, more)
    SELECT host, messageid, body, and, loads, more
    FROM messages_tmp;

However, when that hit a duplicate, it would fail, wouldn't it? Also, would this actually be any quicker than direct insertion from Perl DBI?

--
Ian Cass
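P.S. Would something along these lines get around the duplicate problem? It's only a sketch: the file path, the tab-delimited COPY format and the assumption that the unique constraint is on (host, messageid) are made up for the example, and I've trimmed the column list down to three.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=logs', 'user', 'password',
                           { RaiseError => 1, AutoCommit => 1 });

    # Bulk-load the pre-formatted file into the staging table.  COPY's
    # default text format is tab-delimited; the file has to be readable
    # by the database server (otherwise COPY FROM STDIN / \copy is needed).
    $dbh->do(q{COPY messages_tmp FROM '/tmp/messages.dat'});

    # Copy across only the rows that aren't already in the main table.
    # DISTINCT collapses duplicates within the batch itself, assuming
    # duplicated log lines are identical in every column.
    $dbh->do(q{
        INSERT INTO messages (host, messageid, body)
        SELECT DISTINCT t.host, t.messageid, t.body
        FROM   messages_tmp t
        WHERE  NOT EXISTS (
                   SELECT 1 FROM messages m
                   WHERE  m.host = t.host
                   AND    m.messageid = t.messageid
               )
    });

    # Empty the staging table ready for the next batch.
    $dbh->do('TRUNCATE messages_tmp');

My thinking is that the NOT EXISTS filters out the already-present rows before the INSERT ever touches the unique index, so nothing should fail (as long as nothing else is inserting into messages at the same time), but I'd be glad to hear whether this really ends up faster than row-by-row inserts.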