I was also thinking about adding a 'is_new' column to the table which I would flag as 0, then do a basic copy of all the new rows in with is_new at 1. I'd then do a delete statement to delete all the rows which are duplicate and have a flag of 0 as the copy should leave me some with two rows, one with is_new of 1 and some with 0. Just don't know if this would be best.
On Dec 26, 2007 3:13 PM, Ivan Sergio Borgonovo <[EMAIL PROTECTED]> wrote: > On Wed, 26 Dec 2007 20:48:27 +0100 > Andreas Kretschmer <[EMAIL PROTECTED]> wrote: > > > blackwater dev <[EMAIL PROTECTED]> schrieb: > > > > > I have some php code that will be pulling in a file via ftp. > > > This file will contain 20,000+ records that I then need to pump > > > into the postgres db. These records will represent a subset of > > > the records in a certain table. I basically need an efficient > > > way to pump these rows into the table, replacing matching rows > > > (based on id) already there and inserting ones that aren't. Sort > > > of looping through the result and inserting or updating based on > > > the presents of the row, what is the best way to handle this? > > > This is something that will run nightly. > > > Insert you data to a extra table and work with regular SQL to > > insert/update the destination table. You can use COPY to insert the > > data into your extra table, this works very fast, but you need a > > suitable file format for this. > > What if you know in advance what are the row that should be inserted > and you've a batch of rows that should be updated? > > Is it still the fasted system to insert them all in a temp table with > copy? > > What about the one that have to be updated if you've all the columns, > not just the changed ones? > Is it faster to delete & insert or to update? > > updates comes with the same pk as the destination table. > > thx > > -- > Ivan Sergio Borgonovo > http://www.webthatworks.it > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match >