Hi Tom. Good point about the index paging out of the buffer cache. I tried that and saw no change. I do have shared_buffers at 40GB, so there's a good bit of room there, and I had already done everything on the page you referred to, except for using COPY. At this point the data hasn't been scrubbed, so I'm trapping data errors and duplicates, which is why I can't use COPY yet. As a sidebar, though, I am curious why COPY is considered faster than inserts. I was unable to get COPY above roughly 25K rows per second (pretty fast anyway). Frankly, I was initially running 3 concurrent insert jobs and getting 90K inserts/sec, but after a certain number of records the speed just dropped off.
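
For reference, the loader does roughly this (psycopg2 assumed; the table, columns, and connection details below are placeholders, not the real schema): batched INSERTs with a commit every 5000 rows, plus the COPY variant I benchmarked against.

    # Rough sketch of the two load paths (psycopg2 assumed; table/columns
    # are placeholders, not the real schema).
    import io
    import psycopg2

    conn = psycopg2.connect("dbname=mydb")

    def load_with_inserts(rows):
        # Row-by-row INSERTs, committing every 5000 rows. Each statement is
        # parsed, planned, and round-tripped to the server individually.
        with conn.cursor() as cur:
            for i, r in enumerate(rows, 1):
                cur.execute(
                    "INSERT INTO readings (id, sensor, value) VALUES (%s, %s, %s)", r
                )
                if i % 5000 == 0:
                    conn.commit()
            conn.commit()

    def load_with_copy(rows):
        # COPY: one command, rows streamed as a single text payload, so the
        # per-statement parse/plan and network overhead disappears.
        buf = io.StringIO()
        for r in rows:
            buf.write("\t".join(map(str, r)) + "\n")
        buf.seek(0)
        with conn.cursor() as cur:
            cur.copy_expert("COPY readings (id, sensor, value) FROM STDIN", buf)
        conn.commit()

The catch with COPY is that a single bad or duplicate row aborts the whole command, which is why the error-trapping pass still uses INSERTs.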

pg

From: Tom Lane <t...@sss.pgh.pa.us>
Sent: Wednesday, November 24, 2021 1:32 PM
To: Godfrin, Philippe E <philippe.godf...@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance

"Godfrin, Philippe E" 
<philippe.godf...@nov.com<mailto:philippe.godf...@nov.com>> writes:
> I am inserting a large number of rows, 5, 10, 15 million. The python code 
> commits every 5000 inserts. The table has partitioned children.
> At first, when there were a low number of rows inserted, the inserts would 
> run at a good clip - 30 - 50K inserts per second. Now, after inserting oh say 
> 1.5 Billion rows, the insert rate has dropped to around 5000 inserts per 
> second. I dropped the unique index , rebuilt the other indexes and no change. 
> The instance is 16 vcpu and 64GB ram.

Can you drop the indexes and not rebuild them till after the bulk load is
done? Once the indexes exceed available RAM, insert performance is going
to fall off a cliff, except maybe for indexes that are receiving purely
sequential inserts (so that only the right end of the index gets touched).
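
A minimal sketch of that drop-load-rebuild pattern (psycopg2 assumed; the index and table names here are hypothetical, for illustration only):

    # Minimal sketch of drop-load-rebuild (psycopg2 assumed; names are
    # placeholders).
    import psycopg2

    conn = psycopg2.connect("dbname=mydb")
    with conn, conn.cursor() as cur:
        cur.execute("DROP INDEX IF EXISTS readings_sensor_idx")

    # ... bulk load here (COPY or batched INSERTs) ...

    with conn, conn.cursor() as cur:
        # One CREATE INDEX at the end is a single sort over the table instead
        # of millions of random index insertions that thrash the buffer cache.
        cur.execute("CREATE INDEX readings_sensor_idx ON readings (sensor)")
    conn.close()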

Also see

https://www.postgresql.org/docs/current/populate.html

regards, tom lane
